Methods for Enhancing Bacterial Cell Display of Proteins and Peptides

ABSTRACT

Methods of making and using bacterial display polypeptide libraries using circularly permuted OmpX (CPX) variants are disclosed. The invention further relates to methods for enhancing the display of proteins and peptides at the surface of bacteria by optimizing linkers and incorporating mutations at positions 165 and 166 of CPX.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 15/613,877, filed Jun. 5, 2017, now U.S. Pat. No. 10,640,762, which application is a continuation of U.S. patent application Ser. No. 14/717,679, filed May 20, 2015, now U.S. Pat. No. 9,695,415, which application is a continuation of U.S. patent application Ser. No. 13/615,072, filed Sep. 13, 2012, now U.S. Pat. No. 9,062,107, which application is a continuation of U.S. patent application Ser. No. 12/220,448, filed Jul. 24, 2008, now U.S. Pat. No. 8,293,685, which application claims the benefit of U.S. Provisional Application No. 60/962,086, filed Jul. 26, 2007, which applications are incorporated by reference herein in their entireties.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

This invention was made with government support under U.S. Army Contract DAAD-19-03-D-0004 awarded by the Institute for Collaborative Biotechnologies. The government has certain rights in the invention.

TECHNICAL FIELD

The invention is in the field of protein engineering. In particular, the disclosure relates generally to methods of making and using bacterial display polypeptide libraries, including methods for enhancing the display of proteins and peptides at the surface of bacteria by using vectors encoding circularly permuted OmpX (CPX) variants containing optimized linkers and selected mutations at positions 165 and 166.

BACKGROUND

Display methodologies have proven invaluable for the discovery, production, and optimization of proteins and peptides in a variety of biotechnological applications. Various approaches including phage display (Smith, G. P. (1985) Science, 228, 1315-1317), mRNA (Wilson et al. (2001) Proc. Natl. Acad. Sci. USA, 98, 3750-3755) and DNA display (Yonezawa et al. (2003) Nucleic Acids Res., 31, e118), ribosome display (Hanes, J. & Pluckthun, A. (1997) Proc. Natl. Acad. Sci. USA, 94, 4937-42), eukaryotic virus display (Bupp, K. & Roth, M. J. (2002) Mol. Ther., 5, 329-335; Muller et al. (2003) Nat. Biotechnol., 21:1040-1046), yeast display (Boder, E. T. & Wittrup, K. D. (1997) Nat. Biotechnol., 15, 553-557), and bacterial display (Lu et al. (1995) Biotechnology (NY), 13, 366-372) have been developed to screen diverse molecular repertoires for desired activities. In particular, bacterial display libraries have enabled antibody affinity maturation (Daugherty et al. (2000) Proc. Natl. Acad. Sci. USA, 97, 2029-2034), the discovery of protein binding peptides (Bessette et al. (2004) Protein Eng. Des. Sel., 17, 731-739), cell-specific ligands (Dane et al. (2006) J. Immunol. Methods, 309, 120-129; Nakajima et al. (2000) Gene, 260, 121-131), and the identification of optimal protease substrates (Boulware, K. T. & Daugherty, P. S. (2006) Proc. Natl. Acad. Sci. USA, 103, 7583-7588). One of the key advantages of bacterial surface display is the ability to use flow cytometry for quantitative screening of the libraries, allowing for real-time analysis of binding affinity and specificity to optimize the screening process (Wittrup, K. D. (2001) Curr. Opin. Biotechnol., 12, 395-399). Additionally, the ease of genetic manipulation, high transformation efficiency, and rapid growth rate make E. coli a well-suited host for display. A broad range of bacterial surface display systems have been developed allowing for insertional or terminally fused peptides and proteins to be displayed on the cell surface. Several outer membrane proteins and cellular appendage proteins have been used to present polypeptides as insertional fusions (Bessette et al. (2004) Protein Eng. Des. Sel., 17, 731-739; Charbit et al. (1986) Embo J., 5, 3029-3037; Taschner et al. (2002) Biochem. J., 367, 393-402). The ice nucleation protein (Jung et al. (1998) Nat. Biotechnol., 16, 576-580), intimins (Christmann et al. (1999) Protein Eng., 12, 797-806), and LppOmpA (Francisco et al. (1992) Proc. Natl. Acad. Sci. USA, 89, 2713-2717) have been used to display proteins on the C-terminus of a transmembrane scaffold while N-terminal display has been accomplished using autotransporters IgA1 protease and EstA (Maurer et al. (1997) J Bacteriol, 179, 794-804).

Recently, a unique bacterial display scaffold was developed that allows for N- and/or C-terminal display from a circularly permuted variant of outer membrane protein OmpX (CPX) (Rice et al. (2006) Protein Sci., 15, 825-836). This scaffold enables display of peptides on both termini, but with reduced efficiency when compared to that obtained using insertions into OmpX. Reduced membrane localization of CPX may result from the slower folding rates and reduced stability that have been described previously for circularly permuted proteins (Heinemann, U. & Hahn, M. (1995) Prog. Biophys. Mol. Biol., 64, 121-143). Regardless, reduced display efficiency requires longer induction times to achieve sufficient display for screening by FACS. Importantly, inefficient display can create an undesired selection pressure resulting in growth biases, reduced viability, or differing levels of passenger localization on the cell surface. As a result, screening based upon cell fluorescence can favor passengers most efficiently localized to the surface, rather than passengers enhanced for the properties of interest (e.g., binding affinity).

Thus, there remains a need for additional vectors for bacterial cell display and methods that would more effectively display proteins and peptides.

SUMMARY

The present invention relates to bacterial cell display and methods for enhancing the display of proteins and peptides at the surface of bacteria by using vectors encoding circularly permuted OmpX (CPX) variants containing optimized linkers and selected mutations at positions 165 and 166.

In one aspect, the invention includes a circularly permuted OmpX (CPX) variant comprising a linker joining the native N-terminus and C-terminus, wherein the linker is 3-8 residues in length and comprises a glycine and one or more basic amino acids. In one embodiment, the first residue of the linker is a glycine. In certain embodiments, the linker comprises at least two basic residues, for example, at least two arginine residues or two lysine residues, or at least one arginine residue and at least one lysine residue. In one embodiment the linker is 5 residues in length. In another embodiment, the linker is 6 residues in length. In certain embodiments, the first residue of the linker is a glycine and the third and sixth residues of the linker are selected from the group consisting of arginine, lysine, serine, histidine, glutamine, and asparagine. In certain embodiments, the linker comprises a sequence selected from the group consisting of SEQ ID NOS:2-27.
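The linker constraints recited above (3-8 residues, a glycine, and one or more basic amino acids, with an optional six-residue pattern) can be expressed as a simple sequence check. The following Python sketch is illustrative only and is not part of the disclosed methods. The function names are hypothetical, and treating arginine, lysine, and histidine as the basic residues is an assumption that the text above does not state. GSKSRR (SEQ ID NO:18) is taken from the description of FIG. 2; the other test strings are made-up negative examples.

```python
# Assumption: "basic amino acids" means Arg (R), Lys (K), and His (H).
BASIC_RESIDUES = set("RKH")

# Residues permitted at the third and sixth linker positions in the
# six-residue embodiment (Arg, Lys, Ser, His, Gln, Asn).
ALLOWED_AT_3_AND_6 = set("RKSHQN")


def is_candidate_linker(linker: str) -> bool:
    """Check the general constraints: 3-8 residues, a Gly, and a basic residue."""
    linker = linker.upper()
    return (
        3 <= len(linker) <= 8
        and "G" in linker
        and any(residue in BASIC_RESIDUES for residue in linker)
    )


def matches_six_residue_pattern(linker: str) -> bool:
    """Check the six-residue embodiment: Gly first, allowed residues at 3 and 6."""
    linker = linker.upper()
    return (
        len(linker) == 6
        and linker[0] == "G"
        and linker[2] in ALLOWED_AT_3_AND_6
        and linker[5] in ALLOWED_AT_3_AND_6
    )


if __name__ == "__main__":
    for candidate in ["GSKSRR", "AKR", "GSGSGSGSK"]:
        print(candidate, is_candidate_linker(candidate),
              matches_six_residue_pattern(candidate))
```

A check of this kind only filters candidate sequences by composition; it says nothing about display efficiency, which must be measured experimentally.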

The CPX variant of any of the above embodiments may further comprise one or more mutations that increase the display efficiency of a passenger peptide carried by the CPX variant. In certain embodiments, a hydrophobic residue is substituted at the position corresponding to A165 of the native OmpX protein (numbered relative to the reference sequence of SEQ ID NO:1). Exemplary mutations include, but are not limited to, A165V, A165L, A165I, and A165F. In other embodiments, the amino acid at the position corresponding to G166 (numbered relative to the reference sequence of SEQ ID NO:1) of the native OmpX protein is replaced. Exemplary mutations include, but are not limited to, G166S and G166A.
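Substitutions such as A165L follow the common notation of wild-type residue, 1-based position, and replacement residue. As a minimal sketch of how such a substitution might be applied to a reference sequence in software, the Python example below parses the notation and checks the wild-type residue before substituting. The function name is hypothetical, and the reference string is a placeholder built only so that positions 165 and 166 hold A and G; it is not the sequence of SEQ ID NO:1.

```python
import re

# Placeholder reference: NOT SEQ ID NO:1. It is constructed only so that
# position 165 is 'A' and position 166 is 'G' (1-based numbering).
PLACEHOLDER_REFERENCE = "M" * 164 + "AG" + "M" * 5


def apply_substitution(sequence: str, mutation: str) -> str:
    """Apply a substitution written as <wild type><position><new>, e.g. 'A165L'."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"Unrecognized mutation format: {mutation!r}")
    wild_type, position, replacement = (
        match.group(1), int(match.group(2)), match.group(3)
    )
    if not 1 <= position <= len(sequence):
        raise ValueError(f"Position {position} is outside the sequence")
    index = position - 1  # convert 1-based numbering to a 0-based index
    if sequence[index] != wild_type:
        raise ValueError(
            f"Expected {wild_type} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


if __name__ == "__main__":
    variant = apply_substitution(PLACEHOLDER_REFERENCE, "A165L")
    variant = apply_substitution(variant, "G166S")
    print(variant[160:170])
```

Checking the wild-type residue before substituting guards against applying a mutation to a sequence that uses a different numbering scheme than the reference.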

In certain embodiments, the CPX variant of any of the above embodiments further comprises a passenger polypeptide, which can be fused to either the N-terminus or the C-terminus of the CPX variant. In certain embodiments, two passenger polypeptides are carried simultaneously by the CPX variant, wherein a first passenger polypeptide is fused to the N-terminus and a second passenger polypeptide is fused to the C-terminus of the CPX variant. A passenger polypeptide can be connected to the N- or C-terminus of a CPX variant by a linker sequence, for example, a linker comprising a sequence selected from the group consisting of SEQ ID NO:28 and SEQ ID NO:34 can be used in the practice of the invention.

In certain embodiments, the CPX variant of the invention carries a passenger polypeptide comprising a detectable label. In one embodiment, the passenger polypeptide is a streptavidin binding peptide that binds to streptavidin conjugated to a fluorophore. An exemplary streptavidin binding peptide comprises the amino acid sequence of SEQ ID NO:36. In certain embodiments, two passenger polypeptides are fused to the CPX variant at the N-terminus and C-terminus, respectively, wherein both passenger polypeptides comprise detectable labels, which may be the same or different.

In another aspect, the invention includes a polynucleotide encoding any of the CPX variants described herein.

In another aspect, the invention includes an expression vector comprising a polynucleotide of the invention operably linked to a promoter, wherein the expressed CPX variant is capable of displaying one or more passenger polypeptides on an outer surface of a bacterial cell.

In another aspect, the invention includes a bacterial cell comprising an expression vector of the invention. Exemplary bacterial cells that can be used in the practice of the invention include, but are not limited to, Escherichia coli, Shigella sonnei, Shigella dysenteriae, Shigella flexneri, Salmonella typhimurium, Salmonella enterica, Enterobacter aerogenes, Serratia marcescens, Yersinia pestis, and Klebsiella pneumoniae.

In another aspect, the invention includes a polypeptide display library comprising a polypeptide displayed by a CPX variant of the invention.

In another aspect, the invention includes a method of making a polypeptide display library, the method comprising:

providing a plurality of expression vectors expressing CPX variants carrying a plurality of passenger polypeptides,

transfecting bacterial cells with the expression vectors, and

culturing the bacterial cells under conditions that permit expression of the passenger polypeptides on the surface of the bacterial cells.

In another aspect, the invention includes a method of screening for a CPX variant that displays a passenger polypeptide with greater efficiency in bacteria than another carrier protein carrying the same passenger polypeptide, the method comprising:

transfecting a bacterial cell with an expression vector expressing a CPX variant carrying the passenger polypeptide,

screening for display of the passenger polypeptide at the surface of the bacterial cell within 25 minutes after inducing the expression of the CPX variant carrying the passenger polypeptide; and

comparing the display efficiency of the CPX variant carrying the passenger polypeptide to the display efficiency of another carrier protein carrying the same passenger polypeptide expressed under the same conditions.

The display efficiency of a passenger polypeptide can be increased by screening different CPX variants by this method. For example, a plurality of CPX variants carrying the same passenger polypeptide are screened, wherein each CPX variant comprises a different linker joining the native N-terminus and C-terminus, wherein the linker is 3-8 residues in length and comprises a glycine and one or more basic amino acids, and optionally, a mutation at a position corresponding to A165 or G166 of the native OmpX protein (numbered relative to the reference sequence of SEQ ID NO:1). The CPX variant is selected that displays the passenger polypeptide with the greatest efficiency compared to the plurality of other CPX variants.
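The comparison and selection steps described above can be sketched numerically. The Python example below assumes, purely for illustration, that display efficiency is summarized as the fluorescence of labeled cells divided by a background fluorescence value, in the manner of the fold-above-background values plotted in FIG. 3. The function names, variant names, and all numbers are hypothetical and are not data from this disclosure.

```python
from typing import Dict, Tuple


def fold_over_background(signal: float, background: float) -> float:
    """Return fluorescence expressed as a multiple of the background value."""
    if background <= 0:
        raise ValueError("Background fluorescence must be positive")
    return signal / background


def best_variant(
    measurements: Dict[str, float], background: float
) -> Tuple[str, Dict[str, float]]:
    """Pick the variant with the highest fold fluorescence above background.

    `measurements` maps a variant name to its fluorescence, measured for the
    same passenger polypeptide under the same induction conditions.
    """
    if not measurements:
        raise ValueError("No measurements supplied")
    folds = {
        name: fold_over_background(value, background)
        for name, value in measurements.items()
    }
    return max(folds, key=folds.get), folds


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    example = {"variant_A": 120.0, "variant_B": 900.0}
    winner, folds = best_variant(example, background=30.0)
    print(winner, folds)
```

Because every variant is normalized to the same background and measured under the same conditions, the ranking reflects differences between scaffolds and not differences in labeling or instrument settings.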

In another aspect, the invention provides a method of screening a library of polypeptides for biological activity in the presence of a target molecule, the method comprising: a) providing a polypeptide display library comprising CPX variants carrying a plurality of passenger polypeptides displayed on bacterial cells, b) contacting the plurality of passenger polypeptides with the target molecule, c) assaying for biological activity in the presence of the target molecule, and d) identifying at least one displayed passenger polypeptide that has biological activity. For this purpose, any CPX variant described herein can be used in the polypeptide display library for screening polypeptides. The polypeptide display library can include passenger polypeptides fused to the N-terminus, the C-terminus, or both termini of the CPX variants. The biological activity assayed can be enzymatic activity, substrate activity, ligand-binding activity, transport activity, agonist activity, antagonist activity, or any other biological activity. Any target molecule can be chosen, including but not limited to, a receptor, a ligand, an antibody, an antigen, an enzyme, a transporter, a substrate, an inhibitor, an activator, a cofactor, a drug, a nucleic acid, a lipid, a carbohydrate, a glycoprotein, a small organic molecule, or an inorganic molecule.

In one embodiment, the invention includes a method of screening a library of polypeptides for the ability to bind to a target molecule, the method comprising: a) providing a polypeptide display library comprising CPX variants carrying a plurality of passenger polypeptides displayed on bacterial cells, b) contacting the plurality of passenger polypeptides with the target molecule, and c) identifying at least one displayed passenger polypeptide that binds to the target molecule. In one embodiment, the target molecule comprises a detectable label that enables binding of the target molecule to a passenger polypeptide to be determined by detecting the label attached to the target molecule.

Thus, the subject invention is represented by, but not limited to, the following numbered embodiments:

1. A circularly permuted OmpX (CPX) variant comprising a linker joining the native N-terminus and C-terminus, wherein the linker is 3-8 residues in length and comprises a glycine and one or more basic amino acids.

2. The CPX variant of embodiment 1, wherein the first residue of the linker is a glycine.

3. The CPX variant of embodiment 1 or 2, wherein the linker comprises at least two basic residues.

4. The CPX variant of any of embodiments 1-3, wherein the linker comprises two arginine residues.

5. The CPX variant of any of embodiments 1-4, wherein the linker comprises two lysine residues.

6. The CPX variant of any of embodiments 1-5, wherein the linker comprises at least one arginine residue and at least one lysine residue.

7. The CPX variant of any of embodiments 1 to 6, wherein the linker is 5 residues in length.

8. The CPX variant of any of embodiments 1 to 6, wherein the linker is 6 residues in length.

9. The CPX variant of embodiment 8, wherein the first residue of the linker is a glycine and the third and sixth residues of the linker are selected from the group consisting of arginine, lysine, serine, histidine, glutamine, and asparagine.

10. The CPX variant of any of embodiments 1 to 9, wherein the linker comprises a sequence selected from the group consisting of SEQ ID NOS:2-27.

11. The CPX variant of any of embodiments 1 to 10, further comprising one or more mutations that increase the display efficiency of a passenger peptide compared to the CPX variant in the absence of the mutations, wherein at least one mutation is at a position corresponding to A165 or G166 of the native OmpX protein numbered relative to the reference sequence of SEQ ID NO:1.

12. The CPX variant of embodiment 11 comprising an A165V mutation.

13. The CPX variant of embodiment 11 comprising an A165L mutation.

14. The CPX variant of embodiment 11 comprising an A165I mutation.

15. The CPX variant of embodiment 11 comprising an A165F mutation.

16. The CPX variant of any of embodiments 11 to 15 comprising a G166S mutation.

17. The CPX variant of any of embodiments 11 to 15 comprising a G166A mutation.

18. The CPX variant of any of embodiments 1 to 17, further comprising a passenger polypeptide fused to the N-terminus.

19. The CPX variant of embodiment 18, further comprising a linker between the N-terminus and the passenger polypeptide, wherein said linker comprises a sequence selected from the group consisting of SEQ ID NO:28 and SEQ ID NO:34.

20. The CPX variant of any of embodiments 1 to 17, further comprising a passenger polypeptide fused to the C-terminus.

21. The CPX variant of embodiment 20, further comprising a linker between the C-terminus and the passenger polypeptide, wherein said linker comprises a sequence selected from the group consisting of SEQ ID NO:28 and SEQ ID NO:34.

22. The CPX variant of any of embodiments 1 to 21, further comprising a first passenger polypeptide fused to the N-terminus and a second passenger polypeptide fused to the C-terminus.

23. The CPX variant of embodiment 22, wherein the first passenger polypeptide or the second passenger polypeptide further comprises a detectable label.

24. The CPX variant of embodiment 23, wherein the first passenger polypeptide or the second passenger polypeptide comprises a streptavidin binding peptide.

25. The CPX variant of embodiment 24, wherein the detectable label is streptavidin conjugated to a fluorophore.

26. The CPX variant of embodiment 24, wherein the streptavidin binding peptide comprises the sequence of SEQ ID NO:36.

27. The CPX variant of embodiment 23, wherein both the first passenger polypeptide and the second passenger polypeptide comprise detectable labels.

28. The CPX variant of embodiment 27, wherein the first passenger polypeptide comprises a different detectable label than the second passenger polypeptide.

29. The CPX variant of any of embodiments 22 to 28, further comprising a linker between the first passenger polypeptide and the N-terminus or the second passenger polypeptide and the C-terminus.

30. The CPX variant of embodiment 29, wherein the linker comprises a sequence selected from the group consisting of SEQ ID NO:28 and SEQ ID NO:34.

31. A polynucleotide encoding the CPX variant of any of embodiments 1 to 30.

32. An expression vector comprising the polynucleotide of embodiment 31 operably linked to a promoter, wherein the expressed CPX variant displays one or more passenger polypeptides on an outer surface of a bacterial cell.

33. A bacterial cell comprising the expression vector of embodiment 32.

34. The bacterial cell of embodiment 33, where the bacterial cell is Escherichia coli, Shigella sonnei, Shigella dysenteriae, Shigella flexneri, Salmonella typhimurium, Salmonella enterica, Enterobacter aerogenes, Serratia marcescens, Yersinia pestis, or Klebsiella pneumoniae.

35. A polypeptide display library comprising a polypeptide displayed by the CPX variant of any of embodiments 1 to 30.

36. A method of making the polypeptide display library of embodiment 35, the method comprising:

-   providing a plurality of expression vectors expressing CPX variants carrying a plurality of passenger polypeptides,
-   transfecting bacterial cells with said expression vectors, and
-   culturing the bacterial cells under conditions that permit expression of said passenger polypeptides on the surface of the bacterial cells.

37. A method of screening for a CPX variant that displays a passenger polypeptide with greater efficiency than another carrier protein carrying the same passenger polypeptide, the method comprising:

-   transfecting a bacterial cell with an expression vector expressing a CPX variant carrying the passenger polypeptide,
-   screening for display of the passenger polypeptide at the surface of the bacterial cell within 25 minutes after inducing the expression of the CPX variant carrying the passenger polypeptide; and
-   comparing the display efficiency of the CPX variant carrying the passenger polypeptide to the display efficiency of another carrier protein carrying the same passenger polypeptide expressed under the same conditions.

38. A method of enhancing the display efficiency of a passenger polypeptide, the method comprising:

-   screening a plurality of different CPX variants carrying the same passenger polypeptide according to the method of embodiment 37, wherein each CPX variant comprises a different linker joining the native N-terminus and C-terminus, wherein the linker is 3-8 residues in length and comprises a glycine and one or more basic amino acids, and optionally, a mutation at a position corresponding to A165 or G166 of the native OmpX protein; and
-   selecting the CPX variant that displays the passenger polypeptide with the greatest efficiency compared to the plurality of other CPX variants.

39. The method of embodiment 38, wherein the first residue of the linker is a glycine and the third residue of the linker is selected from the group consisting of arginine, lysine, serine, histidine, glutamine, and asparagine.

40. The method of embodiment 39, wherein the linker is 6 residues in length.

41. The method of embodiment 40, wherein the sixth residue of the linker is selected from the group consisting of arginine, lysine, serine, histidine, glutamine, and asparagine.

42. A method of screening a library of polypeptides for the ability to bind to a target molecule, the method comprising:

-   a) providing a polypeptide display library comprising CPX variants carrying a plurality of passenger polypeptides displayed on bacterial cells,
-   b) contacting the plurality of passenger polypeptides with the target molecule, and
-   c) identifying at least one displayed passenger polypeptide that binds to the target molecule.

43. The method of embodiment 42, wherein the target molecule is selected from the group consisting of a receptor, a ligand, an antibody, an antigen, an enzyme, a transporter, a substrate, an inhibitor, an activator, a cofactor, a drug, a nucleic acid, a lipid, a carbohydrate, a glycoprotein, a small organic molecule, and an inorganic molecule.

44. The method of embodiment 42, wherein said target molecule comprises a detectable label, wherein identifying the target molecule bound to at least one passenger polypeptide comprises detecting the label attached to said target molecule.

45. A method of screening a library of polypeptides for biological activity in the presence of a target molecule, the method comprising:

-   a) providing a polypeptide display library comprising CPX variants carrying a plurality of passenger polypeptides displayed on bacterial cells,
-   b) contacting the plurality of passenger polypeptides with the target molecule,
-   c) assaying for biological activity in the presence of the target molecule, and
-   d) identifying at least one displayed passenger polypeptide that has biological activity.

46. The method of embodiment 45, wherein the biological activity is enzymatic activity, substrate activity, ligand-binding activity, transport activity, agonist activity, or antagonist activity.

47. The method of embodiment 45, wherein the target molecule is selected from the group consisting of a receptor, a ligand, an antibody, an antigen, an enzyme, a transporter, a substrate, an inhibitor, an activator, a cofactor, a drug, a nucleic acid, a lipid, a carbohydrate, a glycoprotein, a small organic molecule, and an inorganic molecule.

These and other embodiments of the subject invention will readily occur to those of skill in the art in view of the disclosure herein.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1A to 1C are plots depicting the bacterial display of a streptavidin (SA)-binding peptide with OmpX (FIG. 1A), CPX (FIG. 1B), and eCPX (FIG. 1C). The SA-binding peptide (AECHPQGPPCIEGRK (SEQ ID NO:36), described by Giebel et al. (1995) Biochemistry, 34, 15430-15435) was displayed in E. coli as an insertion in OmpX, as an N-terminal fusion in CPX, or as an N-terminal fusion in eCPX. Cells were induced at room temperature for time increments between 0 and 90 minutes, labeled with 100 nM SA-PE, and analyzed by flow cytometry.

FIG. 2 depicts a model of the structure of eCPX based on the crystal structure of OmpX (Vogt, J. & Schulz, G. E. (1999) Structure, 7, 1301-1309). The structure shows the native N- and C-termini of OmpX joined by the six residue linker, GSKSRR (SEQ ID NO:18), the A165L and G166S mutations (shown as space filling residues), and the creation of new termini within the second extracellular loop.

FIG. 3 is a graph depicting the display levels of various peptides and mini-proteins using eCPX (shaded) and CPX (white) measured using FACS. The x-axis indicates the fold fluorescence above background for each protein target in the corresponding fluorescent channel. P2 was labeled with mona fused to the fluorescent protein YPet. CRP-1 and V114 were labeled with biotinylated CRP and VEGF, respectively, and then with SA-PE. Mini-Z and T7-1 were labeled with Alexa-conjugated human IgG and anti-T7·tag monoclonal IgG, respectively. SApep was labeled with SA-PE.

FIGS. 4A and 4B show an overlay of 2-D cytometry data of eCPX displaying SApep on the N-terminus and P2 on the C-terminus with a six (FIG. 4A) and a twenty-six (FIG. 4B) residue linker connecting SApep to the N-terminus. Both plots show a negative control (bottom left population), cells labeled with only SA-PE (top left population), cells labeled with SA-PE and YPet-mona (top right population), and cells labeled with only YPet (bottom right population). Display using the longer linker allows for more efficient simultaneous labeling of both termini.
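The four populations described for FIGS. 4A and 4B correspond to the quadrants of a two-color plot, with the YPet signal on the horizontal axis and the SA-PE signal on the vertical axis. As a minimal sketch of how individual events might be assigned to those quadrants, the Python example below applies one threshold per channel. The function name, the threshold values, and the example intensities are hypothetical; actual gates would be set from the negative control population.

```python
def classify_event(ypet: float, pe: float, ypet_gate: float, pe_gate: float) -> str:
    """Assign one cytometry event to a quadrant using one gate per channel.

    ypet      -- fluorescence in the YPet channel (horizontal axis)
    pe        -- fluorescence in the SA-PE channel (vertical axis)
    ypet_gate -- hypothetical threshold separating YPet-negative from positive
    pe_gate   -- hypothetical threshold separating PE-negative from positive
    """
    ypet_positive = ypet > ypet_gate
    pe_positive = pe > pe_gate
    if ypet_positive and pe_positive:
        return "both labels (top right)"
    if pe_positive:
        return "SA-PE only (top left)"
    if ypet_positive:
        return "YPet only (bottom right)"
    return "unlabeled (bottom left)"


if __name__ == "__main__":
    # Hypothetical intensities and gates, for illustration only.
    events = [(20.0, 15.0), (18.0, 2500.0), (3100.0, 2700.0), (2900.0, 12.0)]
    for ypet_value, pe_value in events:
        print(classify_event(ypet_value, pe_value, ypet_gate=100.0, pe_gate=100.0))
```

Under this scheme, more efficient simultaneous labeling of both termini would appear as a larger fraction of events in the top right quadrant.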

DETAILED DESCRIPTION

The practice of the present invention will employ, unless otherwise indicated, conventional methods of pharmacology, chemistry, biochemistry, recombinant DNA techniques and immunology, within the skill of the art. Such techniques are explained fully in the literature. See, e.g., Handbook of Experimental Immunology, Vols. I-IV (D. M. Weir and C. C. Blackwell eds., Blackwell Scientific Publications); A. L. Lehninger, Biochemistry (Worth Publishers, Inc., current edition); Sambrook, et al., Molecular Cloning: A Laboratory Manual (2nd Edition, 1989); Methods In Enzymology (S. Colowick and N. Kaplan eds., Academic Press, Inc.).

All publications, patents and patent applications cited herein, whether supra or infra, are hereby incorporated by reference in their entireties.

I. Definitions

In describing the present invention, the following terms will be employed, and are intended to be defined as indicated below.

It must be noted that, as used in this specification and the appended claims, the singular forms “a”, “an” and “the” include plural referents unless the content clearly dictates otherwise. Thus, for example, reference to “a peptide” includes a mixture of two or more such peptides, and the like.

The term “CPX” as used herein refers to a circularly permuted variant of a bacterial outer membrane protein OmpX (see U.S. patent application Ser. No. 10/920,244, which is herein incorporated by reference in its entirety). The CPX protein consists of the native OmpX signal sequence, which is cleaved after translocation; a sequence with an embedded SfiI restriction site (GQSGQ) (SEQ ID NO:35) after which peptides may be inserted; a flexible linking sequence (GGQSGQ) (SEQ ID NO:28); amino acids S54-F148 of the mature OmpX; a GGSG (SEQ ID NO:2) linker joining the native C- and N-termini of OmpX; and amino acids A1-S53 of the mature OmpX. CPX can be used as a protein scaffold for bacterial display of peptides and proteins at the surface of a bacterial cell. An advantage of using CPX in bacterial display is that both its N- and C-termini are exterior to the cell, which allows polypeptides to be displayed from either terminus or from both termini simultaneously. The term CPX includes circularly permuted variants of OmpX from any strain of bacteria, such as Escherichia coli, Shigella sonnei, Shigella dysenteriae, Shigella flexneri, Salmonella typhimurium, Salmonella enterica, Enterobacter aerogenes, Serratia marcescens, Yersinia pestis, or Klebsiella pneumoniae. The GenBank database contains complete sequences for OmpX proteins from a variety of bacterial isolates, which could be used to produce CPX proteins of the invention. Furthermore, for purposes of the present invention, the term “CPX” refers to a protein which includes modifications, such as deletions, additions and substitutions, for example, replacement of the linker joining the native N- and C-termini of OmpX, substitutions at positions 165 and 166 (numbered with reference to the sequence of native OmpX from Escherichia coli, SEQ ID NO:1), incorporation of alternate restriction sites after which polypeptides or peptides may be inserted, or the addition of linkers between the N-terminus or C-terminus of CPX and a passenger polypeptide, so long as the protein maintains biological activity (i.e., ability to efficiently display polypeptides). These modifications may be deliberate, as through site-directed mutagenesis, or may be accidental, such as through mutations of hosts which produce the proteins or errors due to PCR amplification.
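The order of segments in the mature CPX protein described above (SfiI-site sequence, inserted peptide, flexible linker, residues 54-148, backbone linker, residues 1-53) can be illustrated by assembling the pieces as strings. The Python sketch below is a simplified illustration under stated assumptions: the function name is hypothetical, the signal sequence is omitted because it is cleaved after translocation, and the mature OmpX string is a placeholder of the right length with A, S, S, and F at positions 1, 53, 54, and 148, not a real OmpX sequence. The passenger AECHPQGPPCIEGRK (SEQ ID NO:36) and the linker GSKSRR (SEQ ID NO:18) are taken from the figure descriptions.

```python
SFI_SITE_PEPTIDE = "GQSGQ"         # SEQ ID NO:35, peptides are inserted after it
FLEXIBLE_LINKER = "GGQSGQ"         # SEQ ID NO:28
ORIGINAL_BACKBONE_LINKER = "GGSG"  # SEQ ID NO:2, joins native C- and N-termini

# Placeholder for mature OmpX (148 residues). NOT a real sequence: only
# positions 1 (A), 53 (S), 54 (S), and 148 (F) follow the text; 'X' is filler.
PLACEHOLDER_MATURE_OMPX = "A" + "X" * 51 + "SS" + "X" * 93 + "F"


def build_cpx(mature_ompx: str, passenger: str = "",
              backbone_linker: str = ORIGINAL_BACKBONE_LINKER) -> str:
    """Assemble a circularly permuted scaffold in the segment order given above."""
    if len(mature_ompx) < 148:
        raise ValueError("Expected a mature sequence of at least 148 residues")
    c_terminal_part = mature_ompx[53:148]  # residues 54-148 (S54-F148)
    n_terminal_part = mature_ompx[:53]     # residues 1-53 (A1-S53)
    return (
        SFI_SITE_PEPTIDE
        + passenger
        + FLEXIBLE_LINKER
        + c_terminal_part
        + backbone_linker
        + n_terminal_part
    )


if __name__ == "__main__":
    scaffold = build_cpx(
        PLACEHOLDER_MATURE_OMPX,
        passenger="AECHPQGPPCIEGRK",
        backbone_linker="GSKSRR",
    )
    print(len(scaffold))
    print(scaffold[:30])
```

The permutation moves the original N-terminal segment behind the original C-terminal segment, so the new termini fall where residues 53 and 54 were previously joined.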

The terms “polypeptide”, “peptide”, “protein”, and “amino acid sequence” as used herein generally refer to any compound comprising naturally occurring or synthetic amino acid polymers or amino acid-like molecules including but not limited to compounds comprising amino and/or imino molecules. No particular size is implied by use of the term “peptide”, “oligopeptide”, “polypeptide”, or “protein” and these terms are used interchangeably. Included within the definition are, for example, polypeptides containing one or more analogs of an amino acid (including, for example, unnatural amino acids, etc.), polypeptides with substituted linkages, as well as other modifications known in the art, both naturally occurring and non-naturally occurring (e.g., synthetic). Thus, synthetic oligopeptides, dimers, multimers (e.g., tandem repeats, multiple antigenic peptide (MAP) forms, linearly-linked peptides), cyclized, branched molecules and the like, are included within the definition. The terms also include molecules comprising one or more peptoids (e.g., N-substituted glycine residues) and other synthetic amino acids or peptides. (See, e.g., U.S. Pat. Nos. 5,831,005; 5,877,278; and 5,977,301; Nguyen et al. (2000) Chem Biol. 7(7):463-473; and Simon et al. (1992) Proc. Natl. Acad. Sci. USA 89(20):9367-9371 for descriptions of peptoids). Non-limiting lengths of peptides suitable for use in the present invention include peptides of 3 to 5 residues in length, 6 to 10 residues in length (or any integer therebetween), 11 to 20 residues in length (or any integer therebetween), 21 to 75 residues in length (or any integer therebetween), 75 to 100 (or any integer therebetween), or polypeptides of greater than 100 residues in length. Typically, polypeptides useful in this invention can have a maximum length suitable for the intended application. Preferably, the polypeptide is between about 3 and 100 residues in length. Generally, one skilled in the art can easily select the maximum length in view of the teachings herein. Further, peptides as described herein, for example synthetic peptides, may include additional molecules such as labels or other chemical moieties (e.g., streptavidin conjugated to phycoerythrin, Alexa dye conjugated to anti-T7 tag). Such moieties may further enhance interaction of the peptides with a ligand and/or further detection of polypeptide display.

Thus, reference to peptides also includes derivatives of the amino acidsequences of the invention including one or more non-naturally occurringamino acid. A first polypeptide is “derived from” a second polypeptideif it is (i) encoded by a first polynucleotide derived from a secondpolynucleotide encoding the second polypeptide, or (ii) displayssequence identity to the second polypeptide as described herein.Sequence (or percent) identity can be determined as described below.Preferably, derivatives exhibit at least about 50% percent identity,more preferably at least about 80%, and even more preferably betweenabout 85% and 99% (or any value therebetween) to the sequence from whichthey were derived. Such derivatives can include postexpressionmodifications of the polypeptide, for example, glycosylation,acetylation, phosphorylation, and the like.

Amino acid derivatives can also include modifications to the native sequence, such as deletions, additions and substitutions (generally conservative in nature), so long as the polypeptide maintains the desired activity. These modifications may be deliberate, as through site-directed mutagenesis, or may be accidental, such as through mutations of hosts that produce the proteins or errors due to PCR amplification. Furthermore, modifications may be made that have one or more of the following effects: increasing efficiency of bacterial display, level of expression, or stability of the polypeptide. Polypeptides described herein can be made recombinantly, synthetically, or in tissue culture.

A CPX polypeptide or protein molecule, as defined above, is a circularly permuted variant of a bacterial outer membrane protein OmpX derived from bacteria, including, but not limited to, Escherichia coli, Shigella sonnei, Shigella dysenteriae, Shigella flexneri, Salmonella typhimurium, Salmonella enterica, Enterobacter aerogenes, Serratia marcescens, Yersinia pestis, or Klebsiella pneumoniae. The molecule need not be physically derived from the particular isolate in question, but may be synthetically or recombinantly produced.

The amino acid sequences of a number of OmpX proteins are known. Representative sequences from bacteria are listed in the National Center for Biotechnology Information (NCBI) database. See, for example, NCBI entries: Escherichia coli OmpX, Accession No. P0A917; Serratia marcescens OmpX, Accession No. AAS78634; Salmonella enterica subsp. enterica serovar Choleraesuis str. SC-B67 ail and ompX homolog, Accession No. YP_219185; Salmonella enterica subsp. enterica serovar Typhi OmpX precursor, Accession No. CAD05280; Enterobacter cloacae OmpX, Accession No. P25253; Yersinia pseudotuberculosis IP 32953 OmpX, Accession No. YP_071052; Shigella flexneri OmpX precursor, Accession No. P0A920; Escherichia coli OmpX precursor, Accession No. P0A918; Escherichia coli OmpX precursor, Accession No. P0A919; Salmonella enterica subsp. enterica serovar Typhi Ty2 OmpX, Accession No. NP_805818; Shigella flexneri 2a str. 301 OmpX, Accession No. NP_706692; Yersinia pestis KIM OmpX, Accession No. NP_669000; Salmonella enterica subsp. enterica serovar Typhi str. CT18 OmpX, Accession No. NP_455368; Salmonella typhimurium LT2 OmpX, Accession No. NP_459810; Escherichia coli O157:H7 str. Sakai OmpX, Accession No. NP_308919; Escherichia coli O157:H7 EDL933 OmpX, Accession No. NP_286578; Shigella flexneri 2a str. 2457T OmpX, Accession No. NP_836469; Salmonella enterica subsp. enterica serovar Choleraesuis str. SC-B67 OmpX, Accession No. YP_215816; Yersinia pestis CO92 OmpX, Accession No. NP_406040; Yersinia pestis biovar Microtus str. 91001 OmpX, Accession No. NP_993650; Escherichia coli CFT073 OmpX, Accession No. NP_752830; Salmonella enterica subsp. enterica serovar Paratyphi A str. ATCC 9150 OmpX, Accession No. YP_151143; Erwinia carotovora subsp. atroseptica SCRI1043 OmpX, Accession No. YP_050855; Escherichia coli APEC O1 OmpX precursor, Accession No. ABJ00194; Shigella boydii Sb227 OmpX, Accession No. YP_407207; Escherichia coli UTI89 OmpX, Accession No. ABE06304; Yersinia pestis KIM OmpX, Accession No. NP_669349; Yersinia pestis KIM OmpX, Accession No. NP_668646; Escherichia coli O157:H7 EDL933 OmpX, Accession No. AAG55186; Shigella flexneri 2a str. 2457T OmpX, Accession No. AAP16275; Escherichia coli APEC O1 OmpX precursor, Accession No. YP_851908; Escherichia coli UTI89 OmpX, Accession No. YP_539835; and Shigella sonnei Ss046 OmpX, Accession No. YP_309776; all of which sequences (as entered by the date of filing of this application) are herein incorporated by reference.

The term “passenger” polypeptide refers to a polypeptide linked to the N- or C-terminus of CPX or a variant thereof for display at the surface of a bacterial cell. Preferably, a passenger polypeptide is capable of interacting physically with arbitrary compositions of matter (biological or non-biological), and exhibits a biological activity (e.g., affinity, specificity, catalysis, assembly etc.) substantially similar to the corresponding free polypeptide in solution. In other words, the displayed passenger polypeptide interacts with or binds a given target molecule in a manner that is substantially similar to that when the polypeptide is in its native environment and not attached to CPX or a variant thereof.

As used herein, the term “ligand” refers to a molecule that binds to another molecule, e.g., an antigen binding to an antibody, a hormone or neurotransmitter binding to a receptor, or a substrate or allosteric effector binding to an enzyme and includes natural and synthetic biomolecules, such as proteins, polypeptides, peptides, nucleic acid molecules, carbohydrates, sugars, lipids, lipoproteins, small molecules, natural and synthetic organic and inorganic materials, synthetic polymers, and the like.

The term “polynucleotide”, as known in the art, generally refers to a nucleic acid molecule. A “polynucleotide” can include both double- and single-stranded sequences and refers to, but is not limited to, prokaryotic sequences, eukaryotic mRNA, cDNA from viral, prokaryotic or eukaryotic mRNA, genomic RNA and DNA sequences from viral (e.g., RNA and DNA viruses and retroviruses), prokaryotic DNA or eukaryotic (e.g., mammalian) DNA, and especially synthetic DNA sequences. The term also captures sequences that include any of the known base analogs of DNA and RNA, and includes modifications such as deletions, additions and substitutions (generally conservative in nature), to the native sequence. These modifications may be deliberate, as through site-directed mutagenesis, or may be accidental, such as through mutations of hosts including polynucleotides encoding CPX or a variant thereof. Modifications of polynucleotides may have any number of effects including, for example, facilitating expression/bacterial display of the polypeptide product at the surface of a host cell.

A polynucleotide can encode a biologically active (e.g., CPX or a variant thereof) protein or polypeptide. Depending on the nature of the polypeptide encoded by the polynucleotide, a polynucleotide can include as few as 10 nucleotides, e.g., where the polynucleotide encodes a linker, tag or label, or an antigen or epitope for bacterial display. Typically, the polynucleotide encodes peptides of at least 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30 or even more amino acids.

“Recombinant” as used herein to describe a nucleic acid molecule means a polynucleotide of genomic, cDNA, viral, semisynthetic, or synthetic origin which, by virtue of its origin or manipulation, is not associated with all or a portion of the polynucleotide with which it is associated in nature. The term “recombinant” as used with respect to a protein, polypeptide, or peptide means a polypeptide produced by expression of a recombinant polynucleotide. In general, the gene of interest is cloned and then expressed in transformed organisms, as described further below. The host organism expresses the foreign gene to produce the protein under expression conditions.

A “polynucleotide coding sequence” or a sequence that “encodes” a selected polypeptide, is a nucleic acid molecule that is transcribed (in the case of DNA) and translated (in the case of mRNA) into a polypeptide in vivo when placed under the control of appropriate regulatory sequences (or “control elements”). The boundaries of the coding sequence are determined by a start codon at the 5′ (amino) terminus and a translation stop codon at the 3′ (carboxy) terminus. A transcription termination sequence may be located 3′ to the coding sequence. Typical “control elements” include, but are not limited to, transcription regulators, such as promoters, transcription enhancer elements, transcription termination signals, and polyadenylation sequences; and translation regulators, such as sequences for optimization of initiation of translation, e.g., Shine-Dalgarno (ribosome binding site) sequences, Kozak sequences (i.e., sequences for the optimization of translation, located, for example, 5′ to the coding sequence), leader sequences (heterologous or native), translation initiation codon (e.g., ATG), and translation termination sequences. Promoters can include inducible promoters (where expression of a polynucleotide sequence operably linked to the promoter is induced by an analyte, cofactor, regulatory protein, etc.), repressible promoters (where expression of a polynucleotide sequence operably linked to the promoter is repressed by an analyte, cofactor, regulatory protein, etc.), and constitutive promoters.

“Operably linked” refers to an arrangement of elements wherein the components so described are configured so as to perform their usual function. Thus, a given promoter operably linked to a coding sequence is capable of effecting the expression of the coding sequence when the proper enzymes are present. The promoter need not be contiguous with the coding sequence, so long as it functions to direct the expression thereof. Thus, for example, intervening untranslated yet transcribed sequences can be present between the promoter sequence and the coding sequence and the promoter sequence can still be considered “operably linked” to the coding sequence.

By “isolated” is meant, when referring to a polynucleotide or a polypeptide, that the indicated molecule is separate and discrete from the whole organism with which the molecule is found in nature or, when the polynucleotide or polypeptide is not found in nature, is sufficiently free of other biological macromolecules so that the polynucleotide or polypeptide can be used for its intended purpose.

The terms “label” and “detectable label” refer to a molecule capable of detection, including, but not limited to, radioactive isotopes, fluorescers, chemiluminescers, enzymes, enzyme substrates, enzyme cofactors, enzyme inhibitors, chromophores, dyes, metal ions, metal sols, ligands (e.g., biotin or haptens) and the like. The term “fluorescer” refers to a substance or a portion thereof that is capable of exhibiting fluorescence in the detectable range. Particular examples of labels that may be used with the invention include, but are not limited to, phycoerythrin, Alexa dyes, fluorescein, YPet, CyPet, Cascade blue, allophycocyanin, Cy3, Cy5, Cy7, rhodamine, dansyl, umbelliferone, Texas red, luminol, acridinium esters, biotin, green fluorescent protein (GFP), enhanced green fluorescent protein (EGFP), yellow fluorescent protein (YFP), enhanced yellow fluorescent protein (EYFP), blue fluorescent protein (BFP), red fluorescent protein (RFP), firefly luciferase, Renilla luciferase, NADPH, beta-galactosidase, horseradish peroxidase, glucose oxidase, alkaline phosphatase, chloramphenicol acetyl transferase, and urease.

The term “derived from” is used herein to identify the original source of a molecule but is not meant to limit the method by which the molecule is made, which can be, for example, by chemical synthesis or recombinant means.

The terms “variant,” “analog” and “mutein” refer to biologically active derivatives of the reference molecule that retain desired activity (e.g., efficient polypeptide display) as described herein. In general, the terms “variant” and “analog” refer to compounds having a native polypeptide sequence and structure with one or more amino acid additions, substitutions and/or deletions (e.g., in the linker joining native N- and C-termini or at positions 165 and 166), relative to the native molecule, so long as the modifications do not destroy biological activity and the compounds remain “substantially homologous” to the reference molecule as defined below. In general, the amino acid sequences of such analogs will have a high degree of sequence homology to the reference sequence, e.g., amino acid sequence homology of more than 50%, generally more than 60%-70%, even more particularly 80%-85% or more, such as at least 90%-95% or more, when the two sequences are aligned. Often, the analogs will include the same number of amino acids but will include substitutions, as explained herein. The term “mutein” further includes polypeptides having one or more amino acid-like molecules including but not limited to compounds comprising only amino and/or imino molecules, polypeptides containing one or more analogs of an amino acid (including, for example, unnatural amino acids, etc.), polypeptides with substituted linkages, as well as other modifications known in the art, both naturally occurring and non-naturally occurring (e.g., synthetic), cyclized, branched molecules and the like. The term also includes molecules comprising one or more N-substituted glycine residues (a “peptoid”) and other synthetic amino acids or peptides. (See, e.g., U.S. Pat. Nos. 5,831,005; 5,877,278; and 5,977,301; Nguyen et al., Chem Biol. (2000) 7:463-473; and Simon et al., Proc. Natl. Acad. Sci. USA (1992) 89:9367-9371 for descriptions of peptoids). Preferably, the analog or mutein has at least the same polypeptide display efficiency as the native OmpX molecule. Methods for making polypeptide analogs and muteins are known in the art and are described further below.

Analogs generally include substitutions that are conservative in nature, i.e., those substitutions that take place within a family of amino acids that are related in their side chains. Specifically, amino acids are generally divided into four families: (1) acidic: aspartate and glutamate; (2) basic: lysine, arginine, histidine; (3) non-polar: alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan; and (4) uncharged polar: glycine, asparagine, glutamine, cysteine, serine, threonine, tyrosine. Phenylalanine, tryptophan, and tyrosine are sometimes classified as aromatic amino acids. For example, it is reasonably predictable that an isolated replacement of leucine with isoleucine or valine, an aspartate with a glutamate, a threonine with a serine, or a similar conservative replacement of an amino acid with a structurally related amino acid, will not have a major effect on the biological activity. For example, the polypeptide of interest may include up to about 5-10 conservative or non-conservative amino acid substitutions, or even up to about 15-25 conservative or non-conservative amino acid substitutions, or any integer between 5-25, so long as the desired function of the molecule remains intact. One of skill in the art may readily determine regions of the molecule of interest that can tolerate change by reference to Hopp/Woods and Kyte-Doolittle plots, well known in the art.
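The four-family grouping above can be expressed as a simple lookup. The following is a minimal illustrative sketch in Python, not part of the disclosed methods; the names FAMILIES, family_of and is_conservative are hypothetical, and the grouping is taken directly from the families listed in this paragraph (one-letter amino acid codes).

```python
# Side-chain families exactly as listed in the paragraph above (one-letter codes).
FAMILIES = {
    "acidic": set("DE"),                # aspartate, glutamate
    "basic": set("KRH"),                # lysine, arginine, histidine
    "non-polar": set("AVLIPFMW"),       # alanine ... tryptophan
    "uncharged polar": set("GNQCSTY"),  # glycine ... tyrosine
}


def family_of(residue: str) -> str:
    """Return the name of the family that contains the given residue."""
    residue = residue.upper()
    for name, members in FAMILIES.items():
        if residue in members:
            return name
    raise ValueError(f"not one of the 20 standard residues: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """A substitution is treated as conservative if both residues share a family."""
    return family_of(original) == family_of(replacement)


# The examples named in the text: Leu->Ile, Leu->Val, Asp->Glu, Thr->Ser.
for pair in ("LI", "LV", "DE", "TS"):
    print(pair, is_conservative(pair[0], pair[1]))
```

Under this grouping, a replacement such as alanine with serine crosses families and would be reported as non-conservative.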

By “derivative” is intended any suitable modification of the native polypeptide of interest, of a fragment of the native polypeptide, or of their respective analogs, such as glycosylation, phosphorylation, polymer conjugation (such as with polyethylene glycol), or other addition of foreign moieties, so long as the desired biological activity of the native polypeptide is retained. Methods for making polypeptide fragments, analogs, and derivatives are generally available in the art.

By “fragment” is intended a molecule consisting of only a part of the intact full-length sequence and structure. The fragment can include a C-terminal deletion, an N-terminal deletion, and/or an internal deletion of the peptide. Active fragments of a particular protein or peptide will generally include at least about 5-10 contiguous amino acid residues of the full-length molecule, preferably at least about 15-25 contiguous amino acid residues of the full-length molecule, and most preferably at least about 20-50 or more contiguous amino acid residues of the full-length molecule, or any integer between 5 amino acids and the full-length sequence, provided that the fragment in question retains biological activity, such as ligand-binding activity, as defined herein.

“Substantially purified” generally refers to isolation of a substance (compound, polynucleotide, protein, polypeptide, polypeptide composition) such that the substance comprises the majority percent of the sample in which it resides. Typically in a sample a substantially purified component comprises 50%, preferably 80%-85%, more preferably 90%-95% of the sample. Techniques for purifying polynucleotides and polypeptides of interest are well-known in the art and include, for example, ion-exchange chromatography, affinity chromatography and sedimentation according to density.

By “isolated” is meant, when referring to a polypeptide, that the indicated molecule is separate and discrete from the whole organism with which the molecule is found in nature or is present in the substantial absence of other biological macro-molecules of the same type. The term “isolated” with respect to a polynucleotide is a nucleic acid molecule devoid, in whole or part, of sequences normally associated with it in nature; or a sequence, as it exists in nature, but having heterologous sequences in association therewith; or a molecule disassociated from the chromosome.

“Homology” refers to the percent identity between two polynucleotide or two polypeptide moieties. Two nucleic acid, or two polypeptide, sequences are “substantially homologous” to each other when the sequences exhibit at least about 50%, preferably at least about 75%, more preferably at least about 80%-85%, preferably at least about 90%, and most preferably at least about 95%-98% sequence identity over a defined length of the molecules. As used herein, substantially homologous also refers to sequences showing complete identity to the specified sequence.

In general, “identity” refers to an exact nucleotide-to-nucleotide or amino acid-to-amino acid correspondence of two polynucleotides or polypeptide sequences, respectively. Percent identity can be determined by a direct comparison of the sequence information between two molecules (the reference sequence and a sequence with unknown % identity to the reference sequence) by aligning the sequences, counting the exact number of matches between the two aligned sequences, dividing by the length of the reference sequence, and multiplying the result by 100. Readily available computer programs can be used to aid in the analysis, such as ALIGN, Dayhoff, M. O. in Atlas of Protein Sequence and Structure, M. O. Dayhoff ed., 5 Suppl. 3:353-358, National Biomedical Research Foundation, Washington, DC, which adapts the local homology algorithm of Smith and Waterman, Advances in Appl. Math. 2:482-489, 1981, for peptide analysis. Programs for determining nucleotide sequence identity are available in the Wisconsin Sequence Analysis Package, Version 8 (available from Genetics Computer Group, Madison, Wis.), for example, the BESTFIT, FASTA and GAP programs, which also rely on the Smith and Waterman algorithm. These programs are readily utilized with the default parameters recommended by the manufacturer and described in the Wisconsin Sequence Analysis Package referred to above. For example, percent identity of a particular nucleotide sequence to a reference sequence can be determined using the homology algorithm of Smith and Waterman with a default scoring table and a gap penalty of six nucleotide positions.
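The direct-comparison calculation described above (count exact matches, divide by the reference length, multiply by 100) can be sketched as follows. This is a minimal illustration in Python that assumes the two sequences have already been aligned by one of the programs named above and that gaps are written as "-"; the function name percent_identity and the gap convention are assumptions made for the example, and it does not perform the alignment itself.

```python
def percent_identity(reference: str, query: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences, relative to the reference length.

    Both strings must be the same length (alignment columns). Gap characters
    never count as matches, and gaps in the reference do not add to its length.
    """
    if len(reference) != len(query):
        raise ValueError("aligned sequences must have the same number of columns")
    matches = sum(1 for r, q in zip(reference, query) if r == q and r != gap)
    reference_length = sum(1 for r in reference if r != gap)
    if reference_length == 0:
        raise ValueError("reference sequence is empty")
    return 100.0 * matches / reference_length


# Two six-residue linkers from this disclosure, compared position by position:
# G=G, S=S, K=K match; S/Q, R/S, R/K do not, giving 3 matches over 6 positions.
print(percent_identity("GSKSRR", "GSKQSK"))
```

Dividing by the reference length rather than the alignment length follows the wording of the paragraph above; other conventions exist and give different values when gaps are present.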

Another method of establishing percent identity in the context of the present invention is to use the MPSRCH package of programs copyrighted by the University of Edinburgh, developed by John F. Collins and Shane S. Sturrok, and distributed by IntelliGenetics, Inc. (Mountain View, Calif.). From this suite of packages the Smith-Waterman algorithm can be employed where default parameters are used for the scoring table (for example, gap open penalty of 12, gap extension penalty of one, and a gap of six). From the data generated the “Match” value reflects “sequence identity.” Other suitable programs for calculating the percent identity or similarity between sequences are generally known in the art, for example, another alignment program is BLAST, used with default parameters. For example, BLASTN and BLASTP can be used using the following default parameters: genetic code=standard; filter=none; strand=both; cutoff=60; expect=10; Matrix=BLOSUM62; Descriptions=50 sequences; sort by=HIGH SCORE; Databases=non-redundant, GenBank+EMBL+DDBJ+PDB+GenBank CDS translations+Swiss protein+Spupdate+PIR. Details of these programs are readily available.

Alternatively, homology can be determined by hybridization of polynucleotides under conditions which form stable duplexes between homologous regions, followed by digestion with single-stranded-specific nuclease(s), and size determination of the digested fragments. DNA sequences that are substantially homologous can be identified in a Southern hybridization experiment under, for example, stringent conditions, as defined for that particular system. Defining appropriate hybridization conditions is within the skill of the art. See, e.g., Sambrook et al., supra; DNA Cloning, supra; Nucleic Acid Hybridization, supra.

II. Modes of Carrying Out the Invention

Before describing the present invention in detail, it is to be understood that this invention is not limited to particular formulations or process parameters as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments of the invention only, and is not intended to be limiting.

Although a number of methods and materials similar or equivalent to those described herein can be used in the practice of the present invention, the preferred materials and methods are described herein.

The purpose of “cell surface display” systems is to present polypeptides on living cells to extracellular targets of any size and molecular composition. The application of bacterial display technology to a broad range of protein engineering applications, however, has been hindered by the absence of robust, validated display scaffolds. The present invention is based on the discovery of novel CPX variants with enhanced properties for use in bacterial display.

The construction of a circularly permuted outer membrane protein OmpX (CPX) for use as a protein scaffold for polypeptide display was described earlier (see U.S. patent application Ser. No. 10/920,244, which is herein incorporated by reference in its entirety). The original CPX protein had the unique characteristic that both C- and N-termini of the scaffold were localized on the bacterial cell surface and available for display of polypeptides and peptides. The CPX protein scaffold consisted of the native OmpX signal sequence, which is cleaved after translocation; a sequence with an embedded SfiI restriction site (GQSGQ) (SEQ ID NO:35) after which peptides may be inserted; a flexible linking sequence (GGQSGQ) (SEQ ID NO:28); amino acids S54-F148 of the mature OmpX; a GGSG (SEQ ID NO:2) linker joining the native C- and N-termini; and amino acids A1-S53 of the mature OmpX. This previously described CPX protein unfortunately exhibited reduced surface localization compared to OmpX, which interfered with the presentation of large peptides and the display of two unique peptides simultaneously from structurally adjacent termini.
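The rearrangement at the core of the scaffold (residues S54-F148, then the linker, then residues A1-S53) can be illustrated with a short sketch. The Python below is a simplified illustration only: the function name circularly_permute is hypothetical, the 148-residue input is a toy placeholder and not the OmpX sequence, and the signal sequence, the GQSGQ insertion site and the GGQSGQ flexible linking sequence are deliberately left out.

```python
def circularly_permute(mature: str, new_start: int = 54, linker: str = "GGSG") -> str:
    """Open the chain before residue `new_start` (1-based) and rejoin the
    native C- and N-termini through `linker`."""
    if not 1 < new_start <= len(mature):
        raise ValueError("new_start must fall inside the sequence")
    c_terminal_part = mature[new_start - 1:]   # residues new_start..end (S54-F148 in CPX)
    n_terminal_part = mature[:new_start - 1]   # residues 1..new_start-1 (A1-S53 in CPX)
    return c_terminal_part + linker + n_terminal_part


# Toy stand-in for the 148-residue mature protein (NOT the real OmpX sequence):
# "N" marks the 53 native N-terminal residues, "C" the remaining 95.
toy_mature = "N" * 53 + "C" * 95
cpx_core = circularly_permute(toy_mature)
print(len(cpx_core))  # 148 residues plus the 4-residue linker
```

Passing a different linker argument, for example one of the linkers identified by library screening, changes only the segment that joins the two native termini.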

As described in Experimental Examples 1-4, semi-rational design and directed evolution were used to create circularly permuted outer membrane protein variants also presenting both the N- and C-termini, but showing significantly enhanced display of a diverse group of peptides, microproteins, and repeat proteins compared to CPX. In order to identify CPX scaffold variants with increased display efficiency, libraries of CPX variants were constructed and screened for optimal linker sequences joining the native N- and C-termini of OmpX and for fortuitous mutations that more efficiently display peptides. More generally, this approach provides a potential route to enhance the performance of a variety of cell surface display scaffolds in presenting passenger proteins. Thus, the methods described herein can be used to make library screens more efficient and less biased towards peptides that are difficult to display.

In order to further an understanding of the invention, a more detailed discussion is provided below regarding the construction of CPX variants having enhanced display properties and their use in bacterial display applications.

A. Circularly Permuted OmpX Variants

Circularly permuted variants, as described herein, can be constructed for any bacterial outer membrane protein OmpX. Representative OmpX sequences from various species of bacteria are known and listed herein. Thus, circularly permuted variants can be derived from any bacterial strain or isolate, including, but not limited to, Escherichia coli, Shigella sonnei, Shigella dysenteriae, Shigella flexneri, Salmonella typhimurium, Salmonella enterica, Enterobacter aerogenes, Serratia marcescens, Yersinia pestis, or Klebsiella pneumoniae. Representative OmpX sequences from bacteria include: Escherichia coli OmpX, Accession No. P0A917; Serratia marcescens OmpX, Accession No. AAS78634; Salmonella enterica subsp. enterica serovar Choleraesuis str. SC-B67 ail and ompX homolog, Accession No. YP_219185; Salmonella enterica subsp. enterica serovar Typhi OmpX precursor, Accession No. CAD05280; Enterobacter cloacae OmpX, Accession No. P25253; Yersinia pseudotuberculosis IP 32953 OmpX, Accession No. YP_071052; Shigella flexneri OmpX precursor, Accession No. P0A920; Escherichia coli OmpX precursor, Accession No. P0A918; Escherichia coli OmpX precursor, Accession No. P0A919; Salmonella enterica subsp. enterica serovar Typhi Ty2 OmpX, Accession No. NP_805818; Shigella flexneri 2a str. 301 OmpX, Accession No. NP_706692; Yersinia pestis KIM OmpX, Accession No. NP_669000; Salmonella enterica subsp. enterica serovar Typhi str. CT18 OmpX, Accession No. NP_455368; Salmonella typhimurium LT2 OmpX, Accession No. NP_459810; Escherichia coli O157:H7 str. Sakai OmpX, Accession No. NP_308919; Escherichia coli O157:H7 EDL933 OmpX, Accession No. NP_286578; Shigella flexneri 2a str. 2457T OmpX, Accession No. NP_836469; Salmonella enterica subsp. enterica serovar Choleraesuis str. SC-B67 OmpX, Accession No. YP_215816; Yersinia pestis CO92 OmpX, Accession No. NP_406040; Yersinia pestis biovar Microtus str. 91001 OmpX, Accession No. NP_993650; Escherichia coli CFT073 OmpX, Accession No. NP_752830; Salmonella enterica subsp. enterica serovar Paratyphi A str. ATCC 9150 OmpX, Accession No. YP_151143; Erwinia carotovora subsp. atroseptica SCRI1043 OmpX, Accession No. YP_050855; Escherichia coli APEC O1 OmpX precursor, Accession No. ABJ00194; Shigella boydii Sb227 OmpX, Accession No. YP_407207; Escherichia coli UTI89 OmpX, Accession No. ABE06304; Yersinia pestis KIM OmpX, Accession No. NP_669349; Yersinia pestis KIM OmpX, Accession No. NP_668646; Escherichia coli O157:H7 EDL933 OmpX, Accession No. AAG55186; Shigella flexneri 2a str. 2457T OmpX, Accession No. AAP16275; Escherichia coli APEC O1 OmpX precursor, Accession No. YP_851908; Escherichia coli UTI89 OmpX, Accession No. YP_539835; and Shigella sonnei Ss046 OmpX, Accession No. YP_309776; all of which sequences (as entered by the date of filing of this application) are herein incorporated by reference. Any of these sequences or a variant thereof comprising a sequence having at least about 80-100% sequence identity thereto, including any percent identity within this range, such as 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99% sequence identity thereto, can be used to construct a CPX variant, as described herein.

Bacterial display can be used in combination with magnetic-activated cell sorting (MACS) and fluorescence-activated cell sorting (FACS) techniques for quantitative library analysis and screening for CPX variants that display polypeptides or peptides efficiently (see, e.g., Examples 1-4 and Rice et al. (2006) Protein Sci. 15:825-836; U.S. Patent Application Publication No. 2005/0196406; Daugherty et al. (2000) J. Immunol. Methods 243(1-2):211-227; Georgiou (2000) Adv. Protein Chem. 55:293-315; Daugherty et al. (2000) Proc. Natl. Acad. Sci. U.S.A. 97(5):2029-2034; Olsen et al. (2003) Methods Mol. Biol. 230:329-342; and Boder et al. (2000) Proc. Natl. Acad. Sci. U.S.A. 97(20):10701-10705; herein incorporated by reference in their entireties). Analysis of the display efficiency of a CPX variant is facilitated by the use of a passenger polypeptide comprising a label (e.g., phycoerythrin, Alexa dye, fluorescein, YPet, CyPet) that allows detection of the displayed polypeptide at the bacterial cell surface.

The sequences of exemplary CPX variants for use in bacterial display are described in the table below:

Clone        Position 165   Position 166   Linker
CPX          A              G              GGSG (SEQ ID NO: 2)
CPX-3X-1     A              G              GRK (SEQ ID NO: 3)
CPX-3X-2     A              G              GRK (SEQ ID NO: 3)
CPX-3X-3     A              G              GTK (SEQ ID NO: 4)
CPX-3X-4     A              G              GKK (SEQ ID NO: 5)
CPX-4X-1     A              G              GSKR (SEQ ID NO: 6)
CPX-4X-2     A              G              GRQK (SEQ ID NO: 7)
CPX-4X-3     A              G              SWPN (SEQ ID NO: 8)
CPX-4X-4     V              G              PRKS (SEQ ID NO: 9)
CPX-5X-1     A              G              GRTRK (SEQ ID NO: 10)
CPX-5X-2     A              G              GRKRN (SEQ ID NO: 11)
CPX-5X-3     V              G              GATRR (SEQ ID NO: 12)
CPX-5X-4     A              S              GSQSK (SEQ ID NO: 13)
CPX-6X-1     A              G              GTKRYH (SEQ ID NO: 14)
CPX-6X-2     A              G              GRRHYK (SEQ ID NO: 15)
CPX-6X-3     A              G              GNRRHR (SEQ ID NO: 16)
CPX-6X-4     A              S              GSKQSK (SEQ ID NO: 17)
CPX-L2-1     L              S              GSKSRR (SEQ ID NO: 18)
CPX-L2-2     F              S              GRKNSH (SEQ ID NO: 19)
CPX-L2-3     I              S              GTRGSQ (SEQ ID NO: 20)
CPX-L2-4     L              S              GHRSHR (SEQ ID NO: 21)
CPX-L2-5     I              S              GDRKRR (SEQ ID NO: 22)
CPX-L2-6     V              A              GARGRH (SEQ ID NO: 23)
CPX-L2-7     V              S              GTHNSQ (SEQ ID NO: 24)
CPX-L2-8     V              S              GPNKSR (SEQ ID NO: 25)
CPX-L2-9     I              S              GPHNSR (SEQ ID NO: 26)
CPX-L2-10    I              S              HRGYHAQR (SEQ ID NO: 27)

As shown in Examples 1-4, CPX variants that efficiently displayed polypeptides were identified by screening libraries of OmpX polypeptides containing different linkers between the native N- and C-termini. In order to identify CPX scaffold variants with optimal linker sequences joining the native C- and N-termini, four separate libraries with three, four, five or six random linker amino acids were screened using MACS and FACS. The CPX variants identified in these screens showed a preference for longer linkers of five to six residues, a consensus for glycine at the first position of the linker, and an abundance of basic residues in the remaining positions.

Thus, in one aspect, the invention includes a CPX variant comprising a linker joining the native N-terminus and C-terminus, wherein the linker is 3-8 residues in length and comprises a glycine and one or more basic amino acids. In a preferred embodiment, the first residue of the linker is a glycine. In certain embodiments, the linker comprises at least two basic residues, for example, at least two arginine residues or two lysine residues, or at least one arginine residue and at least one lysine residue. In preferred embodiments the linker is 5 or 6 residues in length. In one embodiment, the linker is 6 residues in length and the first residue of the linker is a glycine and the third and sixth residues of the linker are selected from the group consisting of arginine, lysine, serine, histidine, glutamine, and asparagine. In certain embodiments, the linker comprises a sequence selected from the group consisting of SEQ ID NOS:2-27.
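The linker criteria recited above lend themselves to a simple check. The following Python sketch is illustrative only: the function name check_linker and the returned keys are hypothetical, and treating histidine as a basic residue alongside lysine and arginine is an assumption based on the side-chain families listed earlier in this description.

```python
# Assumption: "basic" means lysine, arginine or histidine, per the family
# grouping given earlier in this description.
BASIC = set("KRH")


def check_linker(linker: str) -> dict:
    """Report which of the recited linker criteria a candidate sequence meets."""
    linker = linker.upper()
    n_basic = sum(1 for residue in linker if residue in BASIC)
    return {
        # 3-8 residues, containing a glycine and one or more basic residues
        "general": 3 <= len(linker) <= 8 and "G" in linker and n_basic >= 1,
        # preferred: glycine at the first position
        "glycine_first": linker.startswith("G"),
        # certain embodiments: at least two basic residues
        "two_or_more_basic": n_basic >= 2,
        # preferred: 5 or 6 residues
        "preferred_length": len(linker) in (5, 6),
    }


# Linkers taken from the table of exemplary variants.
for name, seq in [("CPX", "GGSG"), ("CPX-L2-1", "GSKSRR"), ("CPX-L2-10", "HRGYHAQR")]:
    print(name, check_linker(seq))
```

Such a filter only restates the recited criteria; it says nothing about display efficiency, which in this disclosure was determined experimentally by MACS and FACS screening.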

In addition, CPX variants can be screened for fortuitous mutations that enhance the display efficiency of a passenger polypeptide. As shown in Example 2, substitutions at positions 165 and 166 (numbered relative to the reference sequence of SEQ ID NO:1) near the native C-terminus of OmpX greatly increased display levels of polypeptides. Thus, the CPX variants described herein may comprise one or more mutations that increase the display efficiency of a passenger polypeptide. In certain embodiments, a hydrophobic residue is substituted at the position corresponding to A165 of the native OmpX protein (numbered relative to the reference sequence of SEQ ID NO:1), for example, a valine, leucine, isoleucine, or phenylalanine. In another embodiment, the amino acid at the position corresponding to G166 (numbered relative to the reference sequence of SEQ ID NO:1) of the native OmpX protein is replaced, for example, with a serine or alanine. In a preferred embodiment, the CPX variant comprises the mutations A165L and G166S and a linker consisting of the sequence of GSKSRR (SEQ ID NO:18).
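Point substitutions written in the form A165L can be applied to a sequence programmatically once a numbering scheme is fixed. The Python sketch below is a minimal illustration; the function name apply_substitution is hypothetical, and the input is a toy placeholder string rather than the reference sequence of SEQ ID NO:1, which is not reproduced here.

```python
import re


def apply_substitution(sequence: str, mutation: str) -> str:
    """Apply a substitution such as 'A165L' (1-based position) to a sequence,
    checking that the expected original residue is present at that position."""
    parsed = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if parsed is None:
        raise ValueError(f"cannot parse mutation: {mutation!r}")
    original, position, replacement = parsed.group(1), int(parsed.group(2)), parsed.group(3)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != original:
        raise ValueError(f"expected {original} at position {position}, "
                         f"found {sequence[position - 1]}")
    return sequence[:position - 1] + replacement + sequence[position:]


# Toy placeholder with A at position 165 and G at position 166 (NOT SEQ ID NO:1).
toy_reference = "T" * 164 + "AG" + "T" * 10
variant = apply_substitution(apply_substitution(toy_reference, "A165L"), "G166S")
print(variant[164:166])  # the residues now at positions 165 and 166
```

The residue check guards against applying a mutation under the wrong numbering, which matters here because positions are defined relative to a specific reference sequence.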

A CPX variant can display a single passenger polypeptide on either the N-terminus or the C-terminus. Alternatively, a CPX variant can display two passenger polypeptides simultaneously on both the N- and C-termini. Preferably, a passenger polypeptide is capable of interacting physically with arbitrary compositions of matter (biological or non-biological), and exhibits a biological activity (e.g., affinity, specificity, catalysis, assembly, etc.) substantially similar to the corresponding free polypeptide in solution. In other words, the displayed passenger polypeptide interacts with or binds a given target molecule in a manner that is substantially similar to that when the polypeptide is in its native environment and not attached to the CPX protein or a variant thereof.

Biterminal display has numerous advantages, including the ability to quantify the amount of the CPX variant displayed on the cell surface and to screen libraries on both termini simultaneously. For this purpose, a CPX variant can be loaded with a single labeled passenger polypeptide or two differently labeled passenger polypeptides in order to allow detection of surface display. The quantification of the display level during library screening by labeling of a passenger polypeptide allows for polypeptides with a high affinity but low display level to be differentiated from polypeptides with a high display level but moderate affinity. Moreover, biterminal display allows for the possibility of creating peptide libraries on each terminus where both peptides can bind to separate regions of the same protein target, causing increased binding affinity and specificity through avidity.

Additionally, linkers may be inserted between a passenger polypeptide of interest and either the N- or C-terminus of the CPX variant to which it is connected in order to avoid steric hindrance between simultaneously displayed passenger polypeptides and/or their binding partners. For example, a long flexible linker comprising multiple repeats of the sequence GGGS (SEQ ID NO:37) (e.g., (GGGS)₄ (SEQ ID NO:38), (GGGS)₅ (SEQ ID NO:34), or (GGGS)₆ (SEQ ID NO:39)) can be used to increase the accessibility of proteins to ligands and to avoid steric hindrance when using biterminal display.
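As a small illustration, the repeat linkers named above can be written out programmatically. The helper `gggs_linker` is a hypothetical name used only for this sketch.

```python
def gggs_linker(repeats: int) -> str:
    """Return a flexible linker made of `repeats` copies of GGGS."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return "GGGS" * repeats


# (GGGS)4, (GGGS)5 and (GGGS)6 are the examples given in the text.
for n in (4, 5, 6):
    print(n, gggs_linker(n), len(gggs_linker(n)))
```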

Polynucleotides Encoding CPX Variants and Library Construction

Polynucleotides encoding CPX variants of the present invention can be produced in any number of ways, all of which are well known in the art.

In one embodiment, the polynucleotides are generated using recombinant techniques well known in the art. One of skill in the art could readily determine nucleotide sequences that encode the desired CPX variants using standard methodology and the teachings herein.

Oligonucleotide probes can be devised based on the known sequences of OmpX proteins and used to probe genomic or cDNA libraries. The sequences can then be further isolated using standard techniques and, e.g., restriction enzymes employed to truncate the gene at desired portions of the full-length sequence. Similarly, sequences of interest can be isolated directly from cells and tissues containing the same, using known techniques, such as phenol extraction, and the sequence further manipulated to produce the desired CPX variants. See, e.g., Sambrook et al., supra, for a description of techniques used to obtain and isolate DNA.

The sequences encoding the CPX variants can also be produced synthetically, for example, based on the known sequences. The nucleotide sequence can be designed with the appropriate codons for the particular amino acid sequence desired. The complete coding sequence is generally assembled from overlapping oligonucleotides prepared by standard methods. See, e.g., Edge (1981) Nature 292:756; Nambiar et al. (1984) Science 223:1299; Jay et al. (1984) J. Biol. Chem. 259:6311; Stemmer et al. (1995) Gene 164:49-53.

Recombinant techniques are readily used to clone sequences encoding CPX variants useful in the claimed invention that can then be mutagenized in vitro by the replacement of the appropriate base pair(s) to result in the codon for the desired amino acid. Such a change can include as little as one base pair, effecting a change in a single amino acid, or can encompass several base pair changes. Alternatively, the mutations can be effected using a mismatched primer that hybridizes to the parent nucleotide sequence (generally cDNA corresponding to the RNA sequence), at a temperature below the melting temperature of the mismatched duplex. The primer can be made specific by keeping primer length and base composition within relatively narrow limits and by keeping the mutant base centrally located. See, e.g., Innis et al. (1990) PCR Applications: Protocols for Functional Genomics; Zoller and Smith, Methods Enzymol. (1983) 100:468. Primer extension is effected using DNA polymerase, the product cloned, and clones containing the mutated DNA, derived by segregation of the primer extended strand, selected. Selection can be accomplished using the mutant primer as a hybridization probe. The technique is also applicable for generating multiple point mutations. See, e.g., Dalbadie-McFarland et al. Proc. Natl. Acad. Sci. USA (1982) 79:6409.

Once coding sequences have been isolated and/or synthesized, they can be cloned into any suitable vector or replicon for expression in bacteria (see Examples). The invention also includes expression constructs for expressing a given passenger polypeptide as an N-terminal fusion protein, a C-terminal fusion protein, or biterminal fusion protein, i.e., linked or fused directly to the CPX protein present on the external surface of a bacterial cell. Display and expression of a passenger polypeptide as an N-terminal or C-terminal or biterminal fusion with a CPX variant is accomplished by topological permutation of an OmpX protein as described in U.S. patent application Ser. No. 10/920,244, which is herein incorporated by reference. Sequence rearrangement of an OmpX protein can be accomplished using overlap extension PCR methods known in the art in order to create either an N-terminal or C-terminal fusion construct, or alternatively, a biterminal fusion construct. See Ho, et al. (1989) Gene 77(1):51-59, which is herein incorporated by reference. As will be apparent from the teachings herein, a wide variety of vectors encoding CPX variants coupled to one or more passenger polypeptides can be generated by creating expression constructs which operably link, in various combinations, polynucleotides encoding CPX variants and passenger polypeptides.

Numerous cloning vectors are known to those of skill in the art, and the selection of an appropriate cloning vector is a matter of choice. Examples of recombinant DNA vectors for cloning include pBAD33, pB30D, pBR322, pACYC177, pKT230, pGV1106, pLAFR1, pME290, pHV14, pBD9, pIJ61, and pUC6. See, generally, DNA Cloning: Vols. I & II, supra; Sambrook et al., supra; B. Perbal, supra.

The gene can be placed under the control of a promoter, ribosome binding site (for bacterial expression) and, optionally, an operator (collectively referred to herein as “control” elements), so that the DNA sequence encoding the desired CPX variant and passenger polypeptide(s) is transcribed into RNA in the host cell transformed by a vector containing this expression construction. The coding sequence may contain a naturally occurring OmpX signal peptide sequence or a heterologous signal sequence (e.g., from another outer membrane protein such as OmpA, OmpT, OmpC, OmpF, OmpN, LamB, FepA, FecA, or the like) to promote expression of the CPX variant at the surface of a bacterial host cell.

Other regulatory sequences may also be desirable which allow for regulation of expression of the protein sequences relative to the growth of the host cell. Such regulatory sequences are known to those of skill in the art, and examples include those which cause the expression of a gene to be turned on or off in response to a chemical or physical stimulus, including the presence of a regulatory compound (e.g., a regulatable promoter for controlled transcription).

In a preferred embodiment, a vector comprising the regulatable promoter araBAD is used to control transcription. Expression and display of the polypeptide is then accomplished by induction of protein expression by contacting with arabinose, preferably for about 10 to about 60 minutes, and more preferably for about 10 to about 20 minutes at 25° C. Controlling expression and display minimizes potential avidity effects that can result from excessive surface concentration of the displayed peptide.

Expression vectors of the present invention may also utilize a low copy origin of replication (e.g., p15A) in order to minimize the metabolic burden on the bacterial host cell such that the clonal representation of the polypeptide library is not affected by growth competition during library propagation. Additionally, expression vectors of the present invention may include a selectable marker such as an antibacterial resistance gene to a bactericidal antibiotic (e.g., chloramphenicol acetyltransferase, beta lactamase, or the like).

The control sequences and other regulatory sequences may be ligated to the coding sequence prior to insertion into a vector. Alternatively, the coding sequence can be cloned directly into an expression vector that already contains the control sequences and an appropriate restriction site.

In some cases it may be necessary to modify the coding sequence so that it may be attached to the control sequences with the appropriate orientation; i.e., to maintain the proper reading frame. Mutants or analogs may be prepared by the deletion of a portion of the sequence encoding the protein, by insertion of a sequence, and/or by substitution of one or more nucleotides within the sequence. Techniques for modifying nucleotide sequences, such as site-directed mutagenesis, are well known to those skilled in the art. See, e.g., Sambrook et al., supra; DNA Cloning, Vols. I and II, supra; Nucleic Acid Hybridization, supra.

The expression vector is then used to transform an appropriate bacterial host cell. A number of bacterial hosts are known in the art, including but not limited to, Escherichia coli, Shigella sonnei, Shigella dysenteriae, Shigella flexneri, Salmonella typhimurium, Salmonella enterica, Enterobacter aerogenes, Serratia marcescens, Yersinia pestis, or Klebsiella pneumoniae, which will find use with the present expression constructs.

In preferred embodiments, a bacterial strain is chosen that is deficient in proteolytic machinery in order to prevent protein degradation. See Meerman, H. J., Nature Biotechnol. 12(11):1107-1110, which is herein incorporated by reference. In some embodiments, a bacterial strain that makes truncated or otherwise modified lipopolysaccharides on its surface may be used to minimize steric effects upon binding to large biomolecules including proteins, viruses, cells, and the like. In some preferred embodiments, the bacterial host has a genotype that aids the expression vector in regulating more tightly the production of the polypeptide to be displayed. The bacterial host may be modified using methods known in the art, including random mutagenesis, DNA shuffling, genome shuffling, gene addition libraries, and the like.

As exemplified herein, Escherichia coli strain MC1061 is a suitable bacterial host for display of passenger polypeptides using CPX variants of the invention. The MC1061 strain exhibits (1) high transformation efficiency of greater than about 5×10⁹ per microgram of DNA, (2) a short doubling time, i.e., 40 minutes or less, during exponential growth phase, (3) high level display of the given polypeptide, and (4) effective maintenance of the expression ON and OFF states (see Example 1).

In preferred embodiments, the expression vectors and libraries of the present invention incorporate (1) the use of a regulatable expression vector that allows on-off control of the production of the CPX protein or variant thereof, (2) efficient restriction sites immediately adjacent to a randomized site for insertion of cloned DNA encoding a random passenger polypeptide fused to the N-terminus, C-terminus or both termini of the CPX variant to facilitate library construction, (3) time and temperature-controlled induction periods to obtain optimal display levels that result in higher quality results, (4) the use of a bacterial strain having a high plasmid transformation efficiency for transformation, (5) the use of optimized library construction protocols to construct large libraries, (6) the use of multiple-plasmid transformation to yield a larger number of unique passenger polypeptides for a given number of host cells, (7) the use of cell concentration to enable complete processing of larger numbers of sequences (e.g., 10¹¹), or (8) any combination thereof.

In some embodiments of the present invention, a DNA library is constructed containing preferably greater than about 10⁸ sequences, and preferably more than about 10¹⁰ unique sequence members, using methods known in the art. This library size is preferred since library size has been shown to correlate with the quality (affinity and specificity) of the selected sequences. See Griffiths, A. D. and D. S. Tawfik (2000) Curr. Opin. Biotechnol. 11(4):338-53, which is herein incorporated by reference.

In some embodiments, a polypeptide library may be prepared by introduction and expression of nucleic acid sequences which encode polypeptides of about 1 to about 1000, preferably about 2 to about 30, amino acids in length. In certain embodiments, high DNA concentrations of more than about 0.1 μg per μl are used during transformation such that the transformed host cell contains one or more independent plasmid molecules. Transformation with multiple plasmids yields a larger number of unique peptides in the same volume of liquid, providing better overall results than when transformation is performed with only one molecule per cell. In some embodiments, a mixture of a plurality of different expression vectors and/or plasmids may be employed, for example, to allow cooperative binding of two different displayed peptides on the same surface, or to present a protein having multiple subunits, and the like.

A desired number of polypeptides may be displayed for different purposes. As exemplified herein, the method of the present invention utilizes an induction period of about 10 minutes to 6 hours to control total expression levels of the display polypeptide and the mode of the subsequent screen or selection such that the level of expression has no measurable effect upon the cell growth rate. In some embodiments, shorter time periods may be used to reduce avidity effects in order to allow selection of high affinity monovalent interactions. As provided herein, the ability to control display speeds the process and yields higher quality results, e.g., sequences that bind to a target with higher affinity.

In some embodiments, a cell concentration by a factor of about 10 may be used to enable complete processing of the entire pool of diversity in a volume of about 10 to about 100 ml. The library may be expanded by propagation by a factor of more than about 100-fold under conditions which prevent synthesis of the library elements, for example, with glucose to repress araBAD or lac promoters, and aliquots of the library may be prepared to represent a number of clones which is more than about three fold greater than the total number of library members.
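The reason for oversampling an aliquot relative to library size can be sketched with a simple Poisson sampling model. This model is an assumption introduced here for illustration (it supposes all library members are equally represented after propagation) and is not taken from the text. Under that assumption, an aliquot containing k times as many clones as there are library members is expected to contain a fraction 1 − e^(−k) of the members at least once.

```python
import math


def expected_coverage(oversampling: float) -> float:
    """Expected fraction of library members present at least once in an
    aliquot holding `oversampling` x (number of library members) clones,
    assuming equal representation (Poisson sampling)."""
    return 1.0 - math.exp(-oversampling)


def required_oversampling(coverage: float) -> float:
    """Oversampling factor needed to reach a target expected coverage."""
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must be strictly between 0 and 1")
    return -math.log(1.0 - coverage)


# Three-fold oversampling, as mentioned in the text.
print(round(expected_coverage(3.0), 4))
# Oversampling needed for 99% expected coverage under the same model.
print(round(required_oversampling(0.99), 2))
```

Under this model, three-fold oversampling corresponds to roughly 95% expected coverage; unequal clone representation would lower that figure.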

For library selection, a subset of the total library, either randomly divided or chosen for specific properties, could be used as a starting point for screening. MACS and/or FACS methods known in the art may be used. Alternatively, methods known in the art that enable physical retention of desired clones and dilution or removal of undesired clones may be used. For example, the library may be grown in a chemostat providing continuous growth, diluting out only those cells that do not bind to a capture agent retained in the vessel. Alternatively, hosts may be cultured with medium having ingredients that promote growth of desired clones.

Cell sorting instrumentation is applied as a quantitative library screening tool to isolate the highest affinity clones from a magnetically enriched population. Two different approaches can be applied for quantitative screening on the basis of either equilibrium binding affinity (Equilibrium Screen) or dissociation rate constants (Kinetic Screen). See Daugherty, P. S., et al. (2000) J. Immunol. Methods 243(1-2):211-227; and Boder, E. T. and K. D. Wittrup (1998) Biotechnology Progress 14(1):55-62, which are herein incorporated by reference in their entireties. For equilibrium screening, cell populations are labeled with limiting concentrations of the target proteins, and all cells exhibiting fluorescence intensities above background autofluorescence are collected.
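The rationale for labeling with limiting target concentrations in an equilibrium screen can be illustrated with the standard single-site binding isotherm, in which the fraction of displayed polypeptide occupied at equilibrium is [L] / ([L] + K_D). The sketch below assumes monovalent binding and target in excess over displayed polypeptide; the concentrations and dissociation constants are hypothetical values chosen only for illustration.

```python
def fraction_bound(target_conc: float, kd: float) -> float:
    """Equilibrium fraction of displayed polypeptide bound by target for a
    single-site, monovalent interaction (same units for both arguments)."""
    if target_conc < 0 or kd <= 0:
        raise ValueError("target_conc must be >= 0 and kd must be > 0")
    return target_conc / (target_conc + kd)


# Hypothetical clones with K_D of 1 nM and 100 nM, labeled at two target
# concentrations (nM). At the limiting concentration the signals differ
# far more than at the saturating one.
for conc in (1.0, 1000.0):
    high = fraction_bound(conc, kd=1.0)
    low = fraction_bound(conc, kd=100.0)
    print(conc, round(high, 3), round(low, 3), round(high / low, 1))
```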

Instead of using random synthetic peptides to provide genetic diversity, fragmented genomic DNA of varying lengths, cDNA of varying lengths, shuffled DNAs, and consensus generated sequences may be employed in accordance with the present invention.

Non-natural amino acids having functionality not represented among natural amino acids, e.g., metal binding, photoactivity, chemical functionality, and the like, may be displayed on the surface using a suitable bacterial host. In this case, the library or an equivalent library may be transformed into strains engineered to produce non-natural amino acids. See Kiick, K. L. et al. (2001) FEBS Lett. 502(1-2):25-30; Kiick, K. L., et al. (2002) PNAS USA 99(1):19-24; Kirshenbaum, K., et al. (2002) Chembiochem. 3(2-3):235-237; and Sharma, N., et al. (2000) FEBS Lett. 467(1):37-40, which are herein incorporated by reference. Peptides incorporating non-natural amino acids are isolated by selection or screening for functions which require inclusion of the non-natural monomers into the displayed polypeptide.

Displayed polypeptides may be made to include post-translational modifications, including glycosylation, phosphorylation, hydroxylation, amidation, and the like, by introduction of a gene or set of genes performing the desired modifications into the strain used for screening and selection, e.g., MC1061 or comparable host strain. Genes performing such post-translational modifications may be isolated from cDNA or genomic libraries by cotransformation with the library and screening for the desired function using FACS or another suitable method. For example, post-translational glycosylation activities (enzymes) can be found by co-transformation.

The polypeptides displayed by CPX or a variant thereof preferably possess a length that preserves the folding and export of the carrier protein while presenting significant sequence and structural diversity. In some embodiments, the CPX or variant thereof used as a carrier protein may be modified by rational redesign or directed evolution by the methods described herein to increase levels of display or enhance polypeptide presentation. For example, the linker between the native N- and C-termini of OmpX may be optimized by random point or cassette mutagenesis and screened for enhanced presentation (see, e.g., Examples 1-4). In addition, mutations may be incorporated into the CPX scaffold that increase the display efficiency of a passenger polypeptide (e.g., substitutions at positions 165 and 166).

Terminal fusion display allows for high mobility of the surface displayed molecule, increased accessibility to target molecules, and simple proteolytic cleavage of the displayed peptide for production of soluble peptides. Terminal fusion display also enables the identification of novel substrates and ligands, e.g., for proteases, peptidases, kinases, receptors, and antibodies. The expression vectors according to the present invention provide a direct way for enhancing the conformational diversity and surface mobility of surface anchored peptides and polypeptides. Through the increased mobility resulting from terminal fusion (as opposed to insertional fusions), the apparent affinity of a polypeptide binding to its corresponding target molecule or material more closely resembles that of the peptide in solution. The N-terminal or C-terminal or biterminal display vectors allow the retention of an energetically stable outer membrane protein structure, compatible with folding, transport, and assembly for efficient display of a given passenger polypeptide on the bacterial cell surface.

In some embodiments, a cDNA library may be cloned into the display position of the N-terminal or C-terminal or biterminal fusion expression vector, with a terminal affinity tag, such as a T7 tag epitope, or a label, or the like, appended to a terminus of the cDNA clone allowing for measurement of the total display level on the cell surface. As used herein, the term “affinity tag” refers to a biomolecule, such as a polypeptide segment, that can be attached to a second biomolecule to provide for purification or detection of the second biomolecule or provide sites for attachment of the second biomolecule to a substrate. Examples of affinity tags include a poly-histidine tract, protein A (Nilsson et al. (1985) EMBO J. 4:1075; Nilsson et al. (1991) Methods Enzymol. 198:3), glutathione S transferase (Smith and Johnson (1988) Gene 67:31), Glu-Glu affinity tag (Grussenmeyer et al. (1985) PNAS USA 82:7952), substance P, FLAG peptide (Hopp et al. (1988) Biotechnology 6:1204), streptavidin binding peptide, or other antigenic epitope or binding domain, and the like (Ford et al. (1991) Protein Expression and Purification 2:950), all of which are herein incorporated by reference. As used herein, a “label” is a molecule or atom which can be conjugated to a biomolecule to render the biomolecule or form of the biomolecule, such as a conjugate, detectable or measurable. Examples of labels include chelators, photoactive agents, radioisotopes, fluorescent agents, paramagnetic ions, and the like.

The presence of surface localized proteins may be monitored using an antibody or reagent specific for the tag or label according to methods known in the art. Cells binding to a target protein may then be selected using MACS and/or FACS. The library pool may be incubated with a fluorescent label of one color (such as green) and then a second fluorescent label of a second color (such as red) to identify the presence of a full length cDNA of interest. Clones which are red and green are then isolated from the library directly using cell sorting methods known in the art.

In some embodiments, the polypeptides of an N-terminal, C-terminal, or biterminal fusion expression vector may be isolated or purified from the outer surface of the host. In other words, a polypeptide may be expressed using an N-terminal, C-terminal, or biterminal fusion expression vector and then produced in a soluble form (free in solution) by introducing a suppressible codon downstream of the given polypeptide. Alternatively, a protease susceptible linker may be used in place of the “suppressible” codon. The polypeptides are displayed on the surface at high density by induction, such as with arabinose for a period of about 2 hours. The cells are washed once or twice in a compatible buffer, such as PBS, to remove undesired proteins and other debris, the cells are concentrated, and a protease is added to the cell suspension. The proteolytically cleaved polypeptide is then harvested by removal of the bacteria by low-speed centrifugation, and transfer of the supernatant into a fresh tube.

C. Applications

The present invention may be broadly applied to methods to isolate, enhance or otherwise alter, peptide and polypeptide sequences that perform useful or desired functions including binding, catalysis, assembly, transport, and the like. For example, the expression vectors of the present invention may be used to isolate peptide molecular transformation catalysts, develop whole-cell reagents, discover peptides that promote self assembly, discover in vivo targeting peptides for drug and gene delivery, discover and increase peptides binding to materials surfaces, e.g., semiconductors, map proteins, such as protein contacts and biomolecular networks, identify enzyme substrates/inhibitors, identify receptor agonists/antagonists, isolate inhibitors of bacterial or viral pathogenesis, discover peptides that mediate endocytosis and cellular entry, map antibody and protein epitopes including multiplex mapping, identify peptide mimics of non-peptide ligands, and isolate metal binding peptides, e.g., for bioremediation or nano-wire synthesis, according to methods known in the art. See Georgiou, G., et al. (1997) Nat. Biotechnol. 15(1):29-34; Pasqualini, R. and E. Ruoslahti (1996) Nature 380(6572):364-366; Whaley, S. R., et al. (2000) Nature 405(6787):665-668; Fields, S. and R. Sternglanz (1994) Trends in Genetics 10(8):286-292; Kim, W. C., et al. (2000) J. Biomol. Screen. 5(6):435-440; Yang, W. P., et al. (1995) J. Mol. Biol. 254(3):392-403; Poul, M. A., et al. (2000) J. Mol. Biol. 301(5):1149-1161; James, L. C., et al. (2003) Science 299(5611):1362-1367; Feldhaus, M. J., et al. (2003) Nat. Biotechnol. 21(2):163-170; Kjaergaard, K., et al. (2001) Appl. Environ. Microbiol. 67(12):5467-5473, and Shusta, E. V., et al. (1999) Curr. Opin. Biotechnol. 10(2):117-122, which are herein incorporated by reference in their entireties.

Thus, in one embodiment, CPX variants of the invention can be used in display libraries for screening polypeptides for biological activity. A polypeptide display library, as described herein, is provided comprising CPX variants carrying a plurality of passenger polypeptides displayed on bacterial cells. The polypeptides are contacted with a target molecule of interest and assayed for biological activity in the presence of the target molecule in order to identify displayed passenger polypeptides that have biological activity. For this purpose, any CPX variant described herein can be used in the polypeptide display library for screening polypeptides. The polypeptide display library can include passenger polypeptides fused to the N- or C- or both termini of the CPX variants. The biological activity assayed can be enzymatic activity, substrate activity, ligand-binding activity, agonist activity, antagonist activity, transport activity, or any other biological activity. Any target molecule can be chosen, including but not limited to, a receptor, a ligand, an antibody, an antigen, an enzyme, a transporter, a substrate, an inhibitor, an activator, a cofactor, a drug, a nucleic acid, a lipid, a carbohydrate, a glycoprotein, a small organic molecule, or an inorganic molecule.

In certain embodiments, the invention includes a method of screening a library of polypeptides for the ability to bind to a target molecule, the method comprising: a) providing a polypeptide display library comprising CPX variants carrying a plurality of passenger polypeptides displayed on bacterial cells, b) contacting the plurality of passenger polypeptides with the target molecule, and c) identifying at least one displayed passenger polypeptide that binds to the target molecule.

The target molecule may comprise a detectable label in order to facilitate detection of binding of the target molecule to the displayed polypeptides. Detectable labels suitable for use in the present invention include any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical, or chemical means. Useful labels in the present invention include biotin or other streptavidin-binding proteins for staining with labeled streptavidin conjugate, magnetic beads (e.g., Dynabeads), fluorescent dyes (e.g., phycoerythrin, YPet, fluorescein, Texas Red, rhodamine, green fluorescent protein, and the like, see, e.g., Molecular Probes, Eugene, Oreg., USA), radiolabels (e.g., ³H, ¹²⁵I, ³⁵S, ¹⁴C, or ³²P), enzymes (e.g., horseradish peroxidase, alkaline phosphatase and others commonly used in an ELISA), and colorimetric labels such as colloidal gold (e.g., gold particles in the 40-80 nm diameter size range scatter green light with high efficiency) or colored glass or plastic (e.g., polystyrene, polypropylene, latex, etc.) beads. Patents teaching the use of such labels include U.S. Pat. Nos. 3,817,837; 3,850,752; 3,939,350; 3,996,345; 4,277,437; 4,275,149; and 4,366,241.

In some embodiments, the N-terminal, C-terminal, or biterminal fusion expression vectors of the present invention can be used for the identification of substrates, such as protease or kinase substrates, from substrate libraries. Accordingly, an expression vector may be modified to express a fluorescent protein using methods known in the art. For example, a bicistronic expression vector may be used comprising (1) a CPX variant, (2) a ribosomal binding site downstream of the CPX variant sequence, and (3) a label such as a green fluorescent protein suitable for efficient detection using fluorescence activated cell sorting (e.g., alajGFP). Expression is then monitored through the intensity of green fluorescence.

For example, a library of protease or peptide substrates is created using methods known in the art. The substrates are fused to the N-terminus or C-terminus or both termini of CPX variants using an expression vector expressing a green fluorescent protein. The substrate library is constructed such that a label or an affinity tag suitable for fluorescence labeling is fused to the free terminus of a passenger polypeptide on the cell surface. Host cells expressing the substrate library labeled with a red fluorescent protein are grown, and cells which are green but not red are removed from the population to eliminate the isolation of false positive clones. The library is then incubated with an enzyme (e.g., a protease or peptidase), and cells which lose red fluorescence while retaining green fluorescence are isolated from the population using FACS.
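The two-stage sorting logic described above can be sketched as a pair of gate functions. The function names, the use of simple intensity thresholds, and the numeric values are all hypothetical; in practice gates are drawn on the cytometer from control populations.

```python
def passes_presort(green: float, red: float, green_min: float, red_min: float) -> bool:
    """Before enzyme treatment: keep cells that are both green (expressing)
    and red (substrate labeled); green-but-not-red cells are discarded."""
    return green >= green_min and red >= red_min


def is_cleaved_candidate(green: float, red: float, green_min: float, red_min: float) -> bool:
    """After enzyme treatment: collect cells that stay green but have lost
    red fluorescence, consistent with cleavage of the displayed substrate."""
    return green >= green_min and red < red_min


# Hypothetical intensities (arbitrary units) and thresholds.
GREEN_MIN, RED_MIN = 100.0, 50.0
before = {"green": 400.0, "red": 300.0}
after = {"green": 390.0, "red": 12.0}
print(passes_presort(before["green"], before["red"], GREEN_MIN, RED_MIN))
print(is_cleaved_candidate(after["green"], after["red"], GREEN_MIN, RED_MIN))
```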

In some embodiments, the N-terminal, C-terminal, or biterminal fusion expression vectors of the present invention may be used to construct whole cells that can be used as reagents. For example, one or more peptides identified using the methods herein, binding to a protein, virus, or cellular receptor, or synthetic composition of matter, are displayed on the outer surface of a bacterial cell at a desired surface density. Cells can then be coupled directly to a material, e.g., glass/silicon, gold, polymer, by virtue of peptides selected to bind these materials, and used to capture in solution molecules binding to various other displayed peptides on the same cell. For optical detection, cells can co-express a fluorescent or luminescent reporter molecule such as GFP or luciferase. Flow cytometry or fluorescence microscopy can be used to detect binding of molecular recognition element displaying cells to the target agent, e.g., virus, cell, particle, bead, and the like.

The polypeptide display systems of the present invention allow the creation of renewable whole cell binding reagents in non-specialized laboratories since this method is technically accessible and libraries are reusable. This approach has already proven useful for selecting cell-specific binding peptides, and for performing diagnostic assays using flow cytometry and fluorescence microscopy. Furthermore, the surface displayed polypeptides can be used for parallel or multiplex ligand isolation, and clones can be processed with efficient single-cell deposition units present on many cell sorters. See Feldhaus, M. J., et al. (2003) Nat. Biotechnol. 21(2):163-170, which is herein incorporated by reference. Consequently, the expression vectors of the present invention may be used in proteomic applications including proteome-wide ligand screens for protein-detecting array development. See Kodadek, T. (2001) Chem. Biol. 8(2):105-115, which is herein incorporated by reference.

III. Experimental

Below are examples of specific embodiments for carrying out the present invention. The examples are offered for illustrative purposes only, and are not intended to limit the scope of the present invention in any way.

Efforts have been made to ensure accuracy with respect to numbers used (e.g., amounts, temperatures, etc.), but some experimental error and deviation should, of course, be allowed for.

Example 1: Materials and Methods

Bacterial Strains, Reagents and Plasmids

All experiments were performed with E. coli strain MC1061 (F- araD139 Δ(ara-leu)7696 galE15 galK16 Δ(lac)X74 rpsL (StrR) hsdR2 (rK− mK+) mcrA mcrB1) (Casadaban, M. J. & Cohen, S. N. (1980) J. Mol. Biol., 138, 179-207). All plasmid constructs utilize pBAD33 (Cm^(r)) (Guzman et al. (1995) J. Bacteriol., 177, 4121-4130), with the araBAD operon promoter and the p15A origin of replication (low-copy number). KOD HOT START DNA polymerase (Novagen) was used for PCRs. Primers were from Operon, restriction enzymes from New England BioLabs, streptavidin-R-phycoerythrin (SA-PE) from Molecular Probes, and streptavidin-coated magnetic microbeads (MYONE streptavidin T1) from Invitrogen. Qiagen mini-preps and gel extraction kits were used for DNA preparation. Ni-NTA agarose for protein purification was from Qiagen and B-PER II bacterial protein extraction reagent was from Pierce Biotechnology.

Vector and Library Construction

Construction of circularly permuted OmpX (CPX) was described previously (Rice et al. (2006) Protein Sci, 15, 825-836). To monitor the display level, a streptavidin binding peptide was fused to the N-terminus of CPX as described earlier; this plasmid is termed pB33CPX-SApep. To generate the libraries that join the original N- and C-termini of OmpX with 3-6 random residues, PD1237-1240 were used as the reverse primers, and PD179 as the forward primer, with pB33CPX-SApep as the template. The random positions were encoded using NNK codons, allowing for all amino acids and the amber stop codon. The product of the PCR reaction was gel purified and then used as a forward primer for the next reaction, using PD180 as a reverse primer, and again with pB33CPX-phage as the template. The product was then gel purified, digested with SfiI, and gel purified again. The digested insert was ligated into the similarly digested vector pB33CPX-SApep. Ligation products were desalted and electroporated into electro-competent MC1061, yielding 7.5×10⁷, 7.5×10⁷, 1.5×10⁸, and 5.0×10⁸ transformants for the 3-, 4-, 5-, and 6-residue libraries, respectively.
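As a rough check on library coverage, the following sketch compares the transformant counts reported above with the number of DNA variants each NNK library can encode. It assumes that every NNK codon combination is equally likely and that clones are sampled independently, which real libraries only approximate; the function name and the Poisson-style estimate are illustrative and are not part of the described method.

```python
import math

# Transformant counts reported above for the 3-, 4-, 5- and 6-residue libraries.
TRANSFORMANTS = {3: 7.5e7, 4: 7.5e7, 5: 1.5e8, 6: 5.0e8}


def expected_coverage(n_random, transformants):
    """Estimate the fraction of NNK DNA variants sampled at least once.

    Assumption (not from the source): uniform, independent sampling, so the
    covered fraction is approximately 1 - exp(-T / V).
    """
    dna_variants = 32 ** n_random  # NNK allows 4 * 4 * 2 = 32 codons per position
    return 1.0 - math.exp(-transformants / dna_variants)


for n, t in sorted(TRANSFORMANTS.items()):
    print(f"{n} random residues: {32 ** n:.2e} DNA variants, "
          f"estimated coverage {expected_coverage(n, t):.2f}")
```

Under these assumptions the shorter libraries are sampled essentially to completion, while the six-residue library is sampled only in part.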

To create the second generation library (CPX-directed), primers PD1282 and PD179 were used to randomize positions A165 and G166 using pB33CPX-SApep as the template. The product was then used as a template for PCR with PD1281 and PD179, adding the second generation library residues. The primer encoded G at the first position of the linker and used the restricted codon MRM to encode residues R, K, S, H, Q, and N at positions 3 and 6 of the linker; the remaining positions used NNK. The product of the previous PCR reaction was then used as a forward primer for the next reaction, using PD180 as a reverse primer, and again with pB33CPX-SApep as the template. The product was then digested and ligated into the similarly digested vector pB33CPX-SApep. Ligation products were desalted and electroporated into electro-competent MC1061, yielding 1.0×10⁹ transformants.
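The residue sets encoded by the degenerate codons named above can be checked by expanding each codon against the standard genetic code. The sketch below is illustrative only; the helper name `encoded_residues` is hypothetical, and the table assumes the standard genetic code with the amber codon (TAG) read as a stop.

```python
from itertools import product

# IUPAC nucleotide codes needed for the NNK and MRM codons.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "ACGT", "K": "GT", "M": "AC", "R": "AG"}

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def encoded_residues(degenerate_codon):
    """Return the set of residues ('*' = stop) a degenerate codon can encode."""
    codons = product(*(IUPAC[base] for base in degenerate_codon))
    return {CODON_TABLE["".join(codon)] for codon in codons}


print("MRM:", "".join(sorted(encoded_residues("MRM"))))
print("NNK:", "".join(sorted(encoded_residues("NNK"))))
```

The MRM expansion contains only R, K, S, H, Q, and N, and the NNK expansion contains all twenty amino acids plus a single stop, which is the amber codon TAG.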

The various binding peptides were fused to the N-terminus of CPX and eCPX using a linker of GGQSGQ (SEQ ID NO:28). PCR was used with pB33CPX-SApep as the template, PD180 as the reverse primer, and with forward primers PD1192/PD1193 for the CRP binding peptide (EWACNDRGFNCQLQR, SEQ ID NO:29), and forward primers PD961/PD962 for the VEGF binding peptide (VEPNCDIHVMWEWECFERL, SEQ ID NO:30). The products were digested with SfiI, ligated into similarly digested vector, and electroporated into MC1061. Primers PD1130-PD1133 were used in an assembly PCR to create the forward primer for the mini-Z-domain (FNMQQQRRFYEALHDPNLNEEQRNAKIKSIRDD, SEQ ID NO:31). This primer was used with PD180 and template pB33CPX-SApep in PCR; the product was digested with SfiI, ligated into similarly digested vector, and electroporated into MC1061. The CPX-T7 construct (MASMTGGQQMG, SEQ ID NO:32) was created using overlap extension PCR with the products from the PCR reactions with primers PD179/PD705 and PD180/PD706. The products were digested with SfiI, ligated into similarly digested vector, and electroporated into MC1061. To transfer these peptides to eCPX, the vectors containing the peptide CPX fusions were digested with PstI and KpnI, and the smaller fragment was gel extracted and ligated to the similarly digested vector of eCPX, transferring the displayed peptide to the eCPX plasmid. To insert the P2 peptide (PAPSIDRSTKPPL, SEQ ID NO:33) at the C-terminus of CPX and eCPX, PCR was used with PD179 as the forward primer and primers PD950/PD951 as the reverse primers with pB33CPX-SApep as template. The primers also encode a linker of GGQSGQ (SEQ ID NO:28) preceding the P2 peptide. The products were digested with SfiI, ligated into similarly digested vector, and electroporated into MC1061. The gene is eCPX-nSApep-cP2. The streptavidin binding peptide was removed using KpnI and HindIII and ligation with a similarly cut insert that contains no N-terminally fused peptide, creating CPX-cP2.

To insert an extended linker of (GGGS)₅ (SEQ ID NO:34) between the streptavidin binding peptide and eCPX-P2, PCR was used with forward primer PD179 and reverse primers PD1429/PD1430/PD31 with pB33eCPX-nSApep-cP2 as template. The product was then gel extracted and used as the forward primer with PD180 as the reverse primer and pB33eCPX-nSApep-cP2 as template. The product was gel extracted, digested with SfiI, and ligated to similarly digested vector. The portion after the OmpX signal sequence is now GQGGQ (encoding a SfiI site, SEQ ID NO:35), AECHPQGPPCIEGRK (the streptavidin binding peptide, SEQ ID NO:36), (GGGS)₅ (the additional linker, SEQ ID NO:34), and GGQSGQ (original linker, SEQ ID NO:28), followed by the S54 of eCPX, with the P2 peptide on the C-terminus. The gene construct is termed eCPX-nSApep-linker-cP2.

Magnetic Selection and Screening by FACS

Magnetic selections were performed for the first round of selection using the libraries CPX-5×, CPX-6×, and CPX-directed. An overnight culture of cells corresponding to 5× the library diversity was inoculated into LB medium containing 34 μg/mL chloramphenicol (Cm) for a final cell concentration of 0.05 OD₆₀₀, or 100 μL of overnight culture into 5 mL LB Cm, whichever is greater. The cultures were then grown at 37° C. to 0.5 OD₆₀₀ with shaking (250 rpm), at which time the culture was moved to room temperature (22° C.) to equilibrate and then induced with L-arabinose to a final concentration of 0.04% (w/v). The cells were induced for 50 minutes at room temperature with shaking (250 rpm). A volume of cells corresponding to 5× the library diversity was concentrated by centrifugation (3000×g, 4° C., 5 min) and resuspended in cold PBS to 10-30 OD₆₀₀. MYONE SA beads (Dynal) were added to a ratio of approximately one bead per four cells. Magnetic separation was used to wash the beads four times with a volume of LB equivalent to the volume used in the initial labeling, and the beads plus bound cells were finally resuspended in LB with Cm and 0.2% glucose (w/v) for overnight growth.
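The cell and bead quantities implied by the 5× oversampling and the one-bead-per-four-cells ratio can be worked out as in the following sketch. It takes the reported transformant count of a library as a stand-in for its diversity, which is an assumption made here for illustration; the function and parameter names are hypothetical.

```python
def macs_inputs(library_diversity, oversampling=5, cells_per_bead=4):
    """Return (cells to process, beads to add) for one magnetic selection.

    oversampling and cells_per_bead follow the 5x coverage and the
    approximate one-bead-per-four-cells ratio stated in the text.
    """
    cells = oversampling * library_diversity
    beads = cells / cells_per_bead
    return cells, beads


# Assumption: diversity of the CPX-directed library taken as its 1.0e9 transformants.
cells, beads = macs_inputs(1.0e9)
print(f"cells: {cells:.2e}, beads: {beads:.2e}")
```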

For flow cytometric sorting, 50 μL of overnight cultures of the libraries were inoculated into 5 mL LB Cm. Cells were induced as described in the previous paragraph; in later rounds of sorting, the induction time was decreased to 30 minutes. Ten μL of cells were labeled with 100 μL of 100 nM SA-PE in PBS on ice for 45 minutes and pelleted by centrifugation, and the supernatant was removed. Cells were resuspended in ice-cold PBS at approximately 10⁷ cells/mL and immediately analyzed and sorted using a FACSARIA cytometer with 488 nm excitation. Between 1 and 5% of the most labeled cells were collected and amplified for further rounds of analysis and/or sorting by growing overnight in LB medium containing glucose and Cm. A subset of the sort was plated directly on agar for isolation of single clones. Typically 4-10 selected clones were assayed for antigen binding by flow cytometry, and the identity of each peptide insert was determined by DNA sequencing.
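Collecting the most labeled 1 to 5% of cells corresponds to placing a fluorescence threshold at the matching rank of the measured events. The sketch below shows one way such a threshold could be computed from a list of per-cell fluorescence values; it is not the cytometer's gating software, and the function name and the sample values are hypothetical.

```python
def sort_gate(fluorescence, fraction):
    """Return the lowest fluorescence value inside the top `fraction` of events."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(fluorescence, reverse=True)
    n_keep = max(1, round(len(ranked) * fraction))  # always keep at least one event
    return ranked[n_keep - 1]


# Hypothetical events: 100 cells with fluorescence values 1..100.
events = list(range(1, 101))
gate = sort_gate(events, 0.05)
collected = [value for value in events if value >= gate]
print(gate, len(collected))
```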

Clonal Characterization

To compare the display levels of CPX and eCPX as a function of time, cells were subcultured 1:50 from overnight stocks into 5 mL LB Cm and grown for 2 hours with shaking (250 rpm) at 37° C. The cells were then moved to room temperature (22° C.) to equilibrate and induced with 0.04% (w/v) L-arabinose while still shaking at 250 rpm. Five μL samples were taken prior to induction and at 30, 60, and 90 minutes after induction, then added to 50 μL of 100 nM SA-PE in PBS and incubated on ice for 45 minutes. The cells were then centrifuged (3000 g, 5 min), the supernatant was removed, and the cells were resuspended in 500 μL ice-cold PBS. Cells were immediately analyzed with a FACSARIA using 488 nm excitation, and fluorescence data were collected at 576 nm.

To compare the display of various peptides using CPX and eCPX, cultures were started using 50 μL of overnight culture in 5 mL LB Cm. Cultures expressing the mini-Z domain, SA binding peptide, P2 peptide, and T7 epitope were grown to an OD₆₀₀ of 0.4, moved to room temperature (22° C.) to equilibrate, and then induced with L-arabinose to a final concentration of 0.04% (w/v) for 30 minutes (two hours for the mini-Z domain). Cultures expressing the CRP and VEGF peptides were induced at 37° C. at an OD₆₀₀ of 0.4 for 30 minutes. Five μL of induced cells were added to 50 μL of PBS containing the respective antigens at the following concentrations: YPet-Mona 50 nM, biotinylated VEGF 65 nM, biotinylated CRP 100 nM, Alexa labeled human IgG 300 nM, SA-PE 100 nM, anti-T7·tag monoclonal IgG 6.7 nM. Samples were labeled on ice for 45 minutes. Biotinylated samples were spun down at 3000 g for 5 minutes and the supernatant was removed. Cells were resuspended in 50 μL of PBS with 10 nM SA-PE and put on ice for 45 minutes. Before cytometric analysis, samples were spun down at 3000 g for 5 minutes, supernatants were removed, and 500 μL of PBS was added to resuspend the cells. Samples were excited at 488 nm, and fluorescence data were collected at 576 nm for SA-PE labeled samples and at 530 nm for Alexa labeled samples and YPet conjugated samples.

For the dual labeling experiments, cells were subcultured 1:50 from overnight stocks into 5 mL LB Cm and grown for 2 hours with shaking (250 rpm) at 37° C. The cells were then induced with 0.04% (w/v) L-arabinose. Cells expressing eCPX-nSApep-cP2 were induced for 25 minutes at 37° C., and cells expressing eCPX-nSApep-linker-cP2 were induced for 45 minutes at 37° C. Five μL of cells were labeled with 50 μL of PBS with 100 nM SA-PE only, 40 nM YPet-Mona only, or with both probes simultaneously. The cells were incubated at room temperature for 45 minutes and centrifuged at 3000 g for 5 minutes, and the supernatant was removed. The cells were left on ice before resuspension in 500 μL ice-cold PBS and analyzed by cytometry with 488 nm excitation, measuring emission at 576 nm and 530 nm.

Example 2: Circularly Permuted OmpX Libraries

The construction of a circularly permuted outer membrane protein OmpX (CPX) for use as a protein scaffold for polypeptide display was described previously (see U.S. patent application Ser. No. 10/920,244, which is herein incorporated by reference in its entirety). One of the advantages of CPX is that both the N- and C-termini are exterior to the cell, which allows polypeptides to be displayed from either terminus. The CPX protein scaffold consists of the native OmpX signal sequence, which is cleaved after translocation; a sequence with an embedded SfiI restriction site (GQSGQ) (SEQ ID NO: 35) after which peptides may be inserted; a flexible linking sequence (GGQSGQ) (SEQ ID NO:28); amino acids S54-F148 of the mature OmpX; a GGSG (SEQ ID NO:2) linker joining the native C- and N-termini; and finally, amino acids A1-S53 of the mature OmpX.
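The rearrangement described above, in which residues S54-F148 are placed ahead of a linker and residues A1-S53, can be expressed as a simple string operation. The sketch below is a minimal illustration of circular permutation; the function name is hypothetical, and the short toy sequence stands in for the mature OmpX sequence, which is not reproduced here.

```python
def circularly_permute(mature, new_n_term, linker):
    """Circularly permute a protein sequence.

    mature     -- mature protein sequence (signal peptide already removed)
    new_n_term -- 1-based number of the residue that becomes the new N-terminus
    linker     -- residues joining the native C-terminus to the native N-terminus
    """
    if not 1 < new_n_term <= len(mature):
        raise ValueError("new_n_term must lie inside the sequence")
    head = mature[new_n_term - 1:]   # for CPX: S54 through F148
    tail = mature[:new_n_term - 1]   # for CPX: A1 through S53
    return head + linker + tail


# Toy sequence used only to show the rearrangement; not the OmpX sequence.
print(circularly_permute("ABCDEFGH", 4, "--"))
```

For CPX as described, the call would use the mature OmpX sequence with `new_n_term=54` and `linker="GGSG"`.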

In order to assess the extent of peptide display, a disulfide constrained streptavidin binding peptide (SApep) with the following amino acid sequence, AECHPQGPPCIEGRK (SEQ ID NO:36) (Giebel et al. (1995) Biochemistry, 34, 15430-15435), was fused to the N-terminus of CPX, allowing cell labeling using a fluorescently-conjugated streptavidin probe and measurement of display levels using cytometry. CPX yielded a reduced level of peptide display when compared to cells displaying the same peptide presented as an insertion within the corresponding region of OmpX (FIGS. 1A, 1B). This reduced display level has the drawback that longer induction periods are required to allow for sufficient fluorescence labeling and library screening, thereby causing cell stress that can result in growth biases during library selection (Daugherty et al. (1999) Protein Eng, 12, 613-621).

The CPX scaffold was constructed using an arbitrarily chosen flexible linker (GGSG) (SEQ ID NO:2) to join the native N- and C-termini. Thus, alternative linkers and point mutations within CPX could enhance the display of peptides on the cell surface. In order to enhance the display characteristics of CPX, various regions of the transmembrane protein were targeted for mutagenesis. An optimized linker region joining the native N- and C-termini was identified by generating and screening four libraries allowing for three (3×), four (4×), five (5×), and six (6×) random residues to be inserted in place of the GGSG (SEQ ID NO:2) linker using the degenerate codon NNK. Each library was screened separately using FACS for clones exhibiting a high level of fluorescence after 50 minutes of induction, indicating increased display of SApep binding to streptavidin-R-phycoerythrin (SA-PE). Under these conditions, the parent CPX scaffold yielded display levels only slightly greater than background autofluorescence after 50 minutes of expression, making the selection of mutants more efficient. After sorting, the display level of several clones was measured using cytometry, and their sequences were determined by DNA sequencing (Table 1).

TABLE 1
Sequences of Clones from Five Selected Linker Libraries

Clone        Position  Position  Linker      SEQ ID NO   Relative display level
             165       166
CPX          A         G         GGSG        2           2.3 ^(a) / 1.5 ^(b)

Three residue linker library
CPX-3X-1     A         G         GRK         3           8.9 ^(a)
CPX-3X-2     A         G         GRK         3           8.3 ^(a)
CPX-3X-3     A         G         GTK         4           7.1 ^(a)
CPX-3X-4     A         G         GKK         5           10 ^(a)

Four residue linker library
CPX-4X-1     A         G         GSKR        6           18 ^(a)
CPX-4X-2     A         G         GRQK        7           14 ^(a)
CPX-4X-3     A         G         SWPN        8           15 ^(a)
CPX-4X-4     V         G         PRKS        9           22 ^(a)

Five residue linker library
CPX-5X-1     A         G         GRTRK       10          24 ^(a)
CPX-5X-2     A         G         GRKRN       11          22 ^(a)
CPX-5X-3     V         G         GATRR       12          32 ^(a)
CPX-5X-4     A         S         GSQSK       13          36 ^(a)

Six residue linker library
CPX-6X-1     A         G         GTKRYH      14          35 ^(a)
CPX-6X-2     A         G         GRRHYK      15          28 ^(a)
CPX-6X-3     A         G         GNRRHR      16          24 ^(a)
CPX-6X-4     A         S         GSKQSK      17          38 ^(a)

Second generation library
CPX-L2-1     L         S         GSKSRR      18          33 ^(b)
CPX-L2-2     F         S         GRKNSH      19          19 ^(b)
CPX-L2-3     I         S         GTRGSQ      20          29 ^(b)
CPX-L2-4     L         S         GHRSHR      21          27 ^(b)
CPX-L2-5     I         S         GDRKRR      22          28 ^(b)
CPX-L2-6     V         A         GARGRH      23          24 ^(b)
CPX-L2-7     V         S         GTHNSQ      24          26 ^(b)
CPX-L2-8     V         S         GPNKSR      25          17 ^(b)
CPX-L2-9     I         S         GPHNSR      26          23 ^(b)
CPX-L2-10    I         S         HRGYHAQR    27          33 ^(b)

^(a) Fold fluorescence above background after 50 minutes of expression
^(b) Fold fluorescence above background after 25 minutes of expression

Isolated clones exhibited three to fifteen-fold enhanced display compared to CPX after only a 30 minute induction period. The identified linker sequences exhibited a preference for basic residues, and glycine was present at the first position of the linker in 14 of 16 clones characterized. In addition, four of the clones had mutations preceding the native C-terminus; two with the substitution A165V and two others with G166S. These four clones were among the most efficient display scaffolds isolated. The average display level of the selected clones from each library increased with increasing linker length, and was highest for 5- and 6-mer linker clones.
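The trend in average display level with linker length can be recomputed from the values listed in Table 1 for the four first-generation libraries. The sketch below is only a convenience calculation on those tabulated values (all measured after 50 minutes of expression); the variable names are illustrative.

```python
from statistics import mean

# Relative display levels from Table 1 (fold above background, 50 min expression),
# keyed by the number of randomized linker residues.
DISPLAY = {
    3: [8.9, 8.3, 7.1, 10],
    4: [18, 14, 15, 22],
    5: [24, 22, 32, 36],
    6: [35, 28, 24, 38],
}
CPX_PARENT = 2.3  # parent CPX under the same conditions

for n, levels in sorted(DISPLAY.items()):
    average = mean(levels)
    print(f"{n}-residue linkers: mean display {average:.1f}, "
          f"{average / CPX_PARENT:.1f}-fold over parent CPX")
```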

In order to further enhance peptide display, the amino acid linker that joins the passenger peptide to the N-terminus of the display scaffold was also targeted for mutagenesis and screening for enhanced variants. A library was created in place of the original linking sequence GGQSGQ (SEQ ID NO:28), by randomizing these six residues. Screening yielded four clones exhibiting a ten-fold more efficient display as compared to CPX. Sequencing did not reveal a consensus within the target linker region. Instead, clones with enhanced display possessed a non-targeted mutation of either A165V or G166S. Since the randomly selected library members from the initial pool did not possess mutations outside of the intended region, these advantageous substitutions were rare and likely arose from PCR errors.

In parallel, a library was generated with random residues at the surface exposed C-terminus of CPX, since native outer membrane proteins (Omps) possess a conserved C-terminal motif that is thought to aid in Omp membrane insertion or assembly (Bos, M. P. & Tommassen, J. (2004) Curr. Opin. Microbiol., 7, 610-6). Four clones were isolated from this library that exhibited more efficient peptide display. Again, these variants did not share consensus in the randomized region, but each carried the spontaneous mutation G166S. These results suggest that the amino acid compositions of the new termini derived from circular permutation have little effect on the rate of assembly and display of CPX, whereas residues A165 and G166 play a key role in proper translocation and insertion of the protein.

In order to combine display-enhancing mutations identified within the most efficient clones from the previous libraries, a final library was designed. A six residue linker library was chosen to connect the native N- and C-termini because the longer linker typically allowed more rapid display as compared to the five residue linker (Table 1). The first amino acid of the linker was fixed to glycine since it was highly conserved in the isolated clones. The third and sixth positions were restricted to R/K/S/H/Q/N using the codon MRM, given the increased frequency of these residues at these positions in clones with enhanced function, and the remaining three positions were fully randomized. Positions A165 and G166, where beneficial substitutions were observed, were also fully randomized. Enhanced variants were identified using two rounds of MACS followed by two rounds of FACS, sorting clones exhibiting the highest display of the SApep after 30 minutes of induction. Ten clones were isolated after the final FACS screening (Table 1). All variant scaffolds identified possessed a bulkier hydrophobic residue (I/L/V/F) in place of alanine at position 165, a consensus for serine at position 166, and a high frequency of the basic residues Arg and Lys within the linking region. The display enhancing substitutions at A165/G166 are located immediately upstream of the native C-terminus of OmpX (FIG. 2).
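The protein-level diversity of the final library follows directly from the design rules above: one fixed position, two positions restricted to six residues, and five fully randomized positions. The following sketch works through that arithmetic and compares it with the 1.0×10⁹ transformants reported for this library; the dictionary keys are illustrative labels, and stop codons and codon bias are ignored.

```python
import math

FULL = 20        # fully randomized position (NNK), counting amino acids only
RESTRICTED = 6   # MRM codon: R, K, S, H, Q, N

# Choices per designed position in the second generation library.
DESIGN = {
    "A165": FULL, "G166": FULL,
    "linker_1": 1,                  # fixed glycine
    "linker_2": FULL,
    "linker_3": RESTRICTED,
    "linker_4": FULL,
    "linker_5": FULL,
    "linker_6": RESTRICTED,
}

protein_variants = math.prod(DESIGN.values())
transformants = 1.0e9  # reported for the CPX-directed library

print(f"protein variants: {protein_variants:.3e}")
print(f"transformants per variant: {transformants / protein_variants:.1f}")
```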

Example 3: Expression Characteristics of Optimized CPX

The scaffold variant exhibiting the most enhanced display characteristics, CPX-L2-1, or enhanced CPX (eCPX), was then compared to CPX and OmpX. The cell surface display level of SApep was measured at incremental times after induction of expression for these three scaffolds. This peptide was displayed either at the N-terminus (CPX and eCPX) or as an insertional fusion within the second extracellular loop of OmpX. The level of display was measured before and after 30, 60, and 90 minutes of induction using flow cytometry (FIGS. 1A-1C). The display rate of eCPX was substantially increased relative to that of CPX, and even slightly higher than that of OmpX. After only 30 minutes of expression, the level of display of eCPX-nSApep was 50-fold above background autofluorescence. Introducing A165L and G166S into OmpX resulted in nearly identical display of SApep relative to that obtained with OmpX (data not shown).

In order to determine whether the enhanced display using the eCPX scaffold is a general effect or specific to the streptavidin binding peptide SApep, several unrelated passenger peptides were fused to the N-terminus of CPX and eCPX and their display levels were compared (FIG. 3). Surprisingly, a disulfide-constrained peptide binding to C-reactive protein (CRPpep) (EWACNDRGFNCQLQR) (SEQ ID NO:29), displayed with eCPX, yielded nearly 50-fold higher fluorescence labeling than that for CPX after only 30 minutes of expression. In fact, the fluorescence of cells displaying CRPpep from CPX could not be distinguished from background (FIG. 3). Similarly, the T7·tag epitope (MASMTGGQQMG) (SEQ ID NO:32) was displayed more efficiently from eCPX than from CPX. A disulfide-constrained 19-mer peptide binding to vascular endothelial growth factor (VEGF), identified previously using phage display (Fairbrother et al. (1998) Biochemistry, 37, 17754-17764), was also displayed over three-fold more efficiently within eCPX. Additionally, an IgG binding mini-protein, a minimized version of the Z-domain from protein A (Braisted, A. C. & Wells, J. A. (1996) Proc. Natl. Acad. Sci. USA, 93, 5688-5692) composed of 33 amino acids that form two antiparallel α-helices, exhibited a display level roughly three-fold higher than that from CPX after a two-hour induction period. Finally, P2, a proline-rich peptide (PAPSIDRSTKPPL) (SEQ ID NO:33) known to bind to the C-terminal SH3 domain of Mona (Harkiolaki, et al. 2003), was expressed as a C-terminal fusion using both CPX and eCPX. Similar to the increased efficiency of display at the N-terminus, the display level of P2 after only 30 minutes of expression was improved ninefold using eCPX compared to display with CPX. Thus, for all peptides investigated, the eCPX scaffold increased display levels when compared to the parental CPX.

Example 4: Biterminal Display with eCPX

Two distinct peptides were simultaneously displayed on the structurally adjacent N- and C-termini of eCPX. SApep was fused to the N-terminus, and the P2 peptide fused to the C-terminus (eCPX-nSApep-cP2). Labeling with fluorescent probes SA-PE (red) and YPet-Mona (Nguyen, A. W. & Daugherty, P. S. (2005) Nat. Biotechnol., 23, 355-360) (green) enabled independent detection of each peptide using flow cytometry. To determine the ability of eCPX to simultaneously display these two peptides, cells expressing the biterminal display scaffold were labeled with SA-PE only, YPet-Mona only, or both probes concurrently. If the peptides bind to their respective receptors independently (i.e., without any steric clashes), there should not be a difference between the extent of single color labeling (fluorescence intensity) of the sample labeled with one probe and that labeled with both fluorescent probes simultaneously.
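The independence criterion stated above can be expressed as a ratio between the fluorescence in one channel of dual-labeled cells and the fluorescence in the same channel of cells labeled with the single probe. The sketch below is illustrative only: the function name is hypothetical, and the fluorescence values are invented placeholders, not measurements from these experiments.

```python
def interference_ratio(single_label, dual_label):
    """Ratio of dual-labeled to single-labeled fluorescence in one channel.

    A value near 1 is what independent binding would give; a value well
    below 1 is consistent with one probe hindering the other.
    """
    if single_label <= 0:
        raise ValueError("single-label fluorescence must be positive")
    return dual_label / single_label


# Hypothetical mean fluorescence values, for illustration only.
red_single, red_dual = 1000.0, 600.0
print(f"red channel ratio: {interference_ratio(red_single, red_dual):.2f}")
```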

However, simultaneous labeling of cells expressing a fusion protein of the form N′-SApep-eCPX-P2-C′ with SA-PE and YPet-Mona, or with each probe separately, yielded differing extents of labeling, consistent with steric interference between these two large fluorescent probes (290 kD and 34 kD, respectively). Specifically, the fluorescence of the cells when labeled with only one probe was always greater than the fluorescence in the corresponding channel of the cells when labeled with both probes simultaneously. In an attempt to reduce steric interference, a long flexible linker of the form (GGGS)₅ (SEQ ID NO:34) was inserted between SApep and eCPX, resulting in a total linker length of 26 amino acids, placing SApep further from the cell surface and thus increasing the distance between the two peptides. Using this long linker, independent labeling of each displayed peptide was enhanced (FIG. 4).
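The total linker length of 26 amino acids follows from placing the (GGGS)₅ insert ahead of the original GGQSGQ linker, as the short check below shows. The variable names are illustrative.

```python
EXTENDED_LINKER = "GGGS" * 5   # (GGGS)5, SEQ ID NO:34
ORIGINAL_LINKER = "GGQSGQ"     # SEQ ID NO:28

# The extended linker precedes the original linker between SApep and eCPX.
total_linker = EXTENDED_LINKER + ORIGINAL_LINKER
print(total_linker, len(total_linker))
```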

These results indicate that the use of a long, unstructured linker can increase the accessibility of large proteins to peptides simultaneously displayed at both termini of eCPX, without substantially reducing the level of display.

Thus, in order to identify CPX scaffold variants with optimal linker sequences for joining the native C- and N-termini of OmpX, four separate libraries with three, four, five or six random linker amino acids were screened using MACS and FACS. Enhanced variants revealed a preference for longer linkers of five to six residues, a strict consensus for glycine at the first position of the linker, and an abundance of basic residues in the remaining positions. Substitutions (A165V, G166S) near the native C-terminus of OmpX greatly increased the display level, and probably arose from rare errors introduced during PCR which were enriched from the large libraries (10⁹). Based on enhanced variants from the initial libraries, a final library was designed with a six residue linker that included restricted positions based on the previous selections and a randomization of positions A165 and G166. After screening, the variant exhibiting the most improvement in display characteristics was named eCPX, and carried substitutions A165L and G166S, with a linker sequence of GSKSRR (SEQ ID NO:18).

The eCPX variant has been shown to increase the display rate of various polypeptide insertions, on either the N- or C-terminus, as compared to the parental CPX. This allows for library screens to be more efficient and less biased towards peptides that are difficult to display.

Also, eCPX has the flexibility to display peptides on either the N- or C-terminus, which is important for many protein binding interactions such as PDZ domains, which preferentially interact with C-terminal peptides (Harris, B. Z. & Lim, W. A. (2001) J. Cell Sci., 114, 3219-3231).

In addition to displaying single peptides on one terminus, two peptides can be displayed simultaneously on opposite termini of eCPX. This biterminal display has numerous advantages, including the ability to quantify the amount of eCPX displayed on the cell surface or to screen libraries on both termini simultaneously. To validate biterminal display, a variant of eCPX displaying N-terminal SApep and C-terminal P2pep was labeled simultaneously with two different protein targets; dual labeled cells had fluorescence levels similar to those of individually labeled cells in the corresponding channels when longer linkers were inserted to avoid steric hindrance between the two proteins binding in close proximity. The quantification of the display level during library screening by labeling of a C-terminal peptide such as P2pep allows peptides with a high affinity but low display level to be differentiated from peptides with a high display level but moderate affinity. Moreover, biterminal display allows for the possibility of creating peptide libraries on each terminus where both peptides can bind to separate regions of the same protein target, causing increased binding affinity and specificity through avidity.

The molecular engineering of eCPX has created a circularly permuted transmembrane protein that has both termini facing the exterior of the cell and inserts into the outer membrane as efficiently as the non-permuted variant. The amino acid sequence used to join the termini played a major role in the proper function of CPX, and minor changes elsewhere in the protein aided in display of peptides using eCPX. This knowledge can be applied to improve other circularly permuted proteins to create variants that are as stable as the wild-type variant but with the termini at a chosen position. Moreover, this unique protein scaffold allows for multiple possibilities for the display of polypeptides and engineering the surface of E. coli, and has advanced the robustness of bacterial surface display.

Thus, methods for bacterial display of proteins and peptides using circularly permuted OmpX (CPX) variants containing optimized linkers and selected mutations at positions 165 and 166 are disclosed. Although preferred embodiments of the subject invention have been described in some detail, it is understood that obvious variations can be made without departing from the spirit and the scope of the invention as described herein.

1.-42. (canceled)
 43. A method of screening a library of polypeptides for the ability to bind to a target molecule, the method comprising: a) providing a polypeptide display library comprising a circularly permuted OmpX (CPX) variant carrying a plurality of passenger polypeptides displayed on bacterial cells; b) contacting the plurality of passenger polypeptides with a target molecule, and c) identifying at least one displayed passenger polypeptide that binds to the target molecule, wherein the CPX variant comprises: a linker joining the native N-terminus and native C-terminus of OmpX, wherein the CPX variant comprises a non-native N-terminus and a non-native C-terminus, wherein the linker is 4-8 residues in length and comprises a sequence X-Z_(n), wherein X is an amino acid selected from the group consisting of: serine, threonine, and proline, wherein each Z is independently any amino acid, and n is 2 to 4, and wherein at least one of the Z residues is independently selected from the group consisting of: lysine, arginine, glutamine, asparagine, or histidine.
 44. The method of claim 43, wherein the target molecule is selected from the group consisting of a receptor, a ligand, an antibody, an antigen, an enzyme, a transporter, a substrate, an inhibitor, an activator, a cofactor, a drug, a nucleic acid, a lipid, a carbohydrate, a glycoprotein, a small organic molecule, and an inorganic molecule.
 45. The method of claim 43, wherein said target molecule comprises a detectable label, wherein identifying the target molecule bound to at least one passenger polypeptide comprises detecting the label attached to said target molecule.
 46. The method of claim 43, wherein the first residue of the linker of the CPX variant is a glycine.
 47. The method of claim 43, wherein the linker of the CPX variant comprises at least two basic residues.
 48. The method of claim 43, wherein the linker of the CPX variant comprises two arginine residues.
 49. The method of claim 43, wherein the linker of the CPX variant comprises two lysine residues.
 50. The method of claim 43, wherein the linker of the CPX variant comprises at least one arginine residue and at least one lysine residue.
 51. The method of claim 43, wherein the linker of the CPX variant is 5 residues in length.
 52. The method of claim 43, wherein the linker of the CPX variant is 6 residues in length.
 53. The method of claim 52, wherein the first residue of the linker of the CPX variant is a glycine and the third and sixth residues of the linker of the CPX variant are selected from the group consisting of arginine, lysine, serine, histidine, glutamine, and asparagine.
 54. The method of claim 43, wherein the CPX variant comprises one or more mutations that increase the display efficiency of a passenger peptide compared to the CPX variant in the absence of the mutations, wherein at least one mutation is at a position corresponding to A165 or G166 of the native OmpX protein consisting of SEQ ID NO: 1.
 55. The method of claim 54, wherein the CPX variant comprises one or more mutations selected from the group consisting of an A165V mutation, an A165L mutation, an A165I mutation, an A165F mutation, a G166S mutation, a G166A mutation and combinations thereof.
 56. The method of claim 43, wherein the CPX variant comprises a passenger polypeptide fused to the non-native N-terminus of the CPX variant.
 57. The method of claim 56, wherein the CPX variant comprises a linker between the non-native N-terminus of the CPX variant and the passenger polypeptide.
 58. The method of claim 43, wherein the CPX variant comprises a passenger polypeptide fused to the non-native C-terminus of the CPX variant.
 59. The method of claim 58, wherein the CPX variant comprises a linker between the non-native C-terminus of the CPX variant and the passenger polypeptide.
 60. The method of claim 43, wherein the CPX variant comprises a first passenger polypeptide fused to the non-native N-terminus of the CPX variant and a second passenger polypeptide fused to the non-native C-terminus of the CPX variant.
 61. The method of claim 60, wherein the first passenger polypeptide or the second passenger polypeptide comprises a detectable label.
 62. The method of claim 61, wherein both the first passenger polypeptide and the second passenger polypeptide comprise detectable labels.
 63. The method of claim 62, wherein the first passenger polypeptide comprises a different detectable label than the second passenger polypeptide.
 64. The method of claim 43, wherein the linker of the CPX variant comprises an amino acid sequence selected from the group consisting of: SEQ ID NOs: 6, 8-10, 12-14, 17, 18, 20, 21, and 24-26.
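
By way of illustration only, the linker pattern recited in claim 43 can be compared against the linker sequences of Table 1 with a short script. The sketch below applies one literal reading of that pattern (a linker of 4-8 residues containing S, T, or P followed by two to four residues, at least one of which is K, R, Q, N, or H); the function name is hypothetical, and the script is an informal consistency check that does not define or limit the claims.

```python
# Linker sequences from Table 1, keyed by SEQ ID NO.
TABLE1_LINKERS = {
    3: "GRK", 4: "GTK", 5: "GKK", 6: "GSKR", 7: "GRQK", 8: "SWPN", 9: "PRKS",
    10: "GRTRK", 11: "GRKRN", 12: "GATRR", 13: "GSQSK", 14: "GTKRYH",
    15: "GRRHYK", 16: "GNRRHR", 17: "GSKQSK", 18: "GSKSRR", 19: "GRKNSH",
    20: "GTRGSQ", 21: "GHRSHR", 22: "GDRKRR", 23: "GARGRH", 24: "GTHNSQ",
    25: "GPNKSR", 26: "GPHNSR", 27: "HRGYHAQR",
}


def matches_linker_pattern(linker):
    """Check a linker against one literal reading of the X-Z_n pattern."""
    if not 4 <= len(linker) <= 8:
        return False
    for i, residue in enumerate(linker):
        if residue not in "STP":          # X must be serine, threonine or proline
            continue
        following = linker[i + 1:i + 5]   # at most four Z residues after X
        if len(following) >= 2 and any(z in "KRQNH" for z in following):
            return True
    return False


matching = sorted(seq_id for seq_id, linker in TABLE1_LINKERS.items()
                  if matches_linker_pattern(linker))
print(matching)
```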